CN105931643A - Speech recognition method and apparatus

Info

Publication number: CN105931643A
Application number: CN201610509498.3A
Authority: CN
Inventors: 杨子剑
Original assignee: Qingdao Haier Intelligent Home Appliance Technology Co Ltd; Beijing Haier Guangke Digital Technology Co Ltd
Current assignee: Qingdao Haier Intelligent Home Appliance Technology Co Ltd; Haier Uplus Intelligent Technology Beijing Co Ltd
Priority date: 2016-06-30
Filing date: 2016-06-30
Publication date: 2016-09-07

Abstract

The invention discloses a speech recognition method and apparatus. The method comprises: obtaining position information of a user through a client; loading, according to the position information, the dialect speech library corresponding to the regional dialect of that position; obtaining speech information input by the user; calling the dialect speech library; and parsing and recognizing the speech information using the speech parsing algorithm corresponding to the dialect speech library. The disclosed technical scheme can improve the speech recognition rate and the parsing accuracy for different regional dialects, improve the user experience, and expand the user base.

Description

Speech recognition method and apparatus

Technical field

The present invention relates to the field of computer technology, and in particular to a speech recognition method and apparatus.

Background

Speech technology is increasingly mature and is used in every area of daily life. In the mobile field, wearable devices and various applications (apps) use speech technology to give users a better experience. Baidu Voice Assistant is one of the more successful domestic applications of speech technology: the user installs Baidu Voice Assistant on a mobile phone and, after starting the application, speaks into the phone's microphone. The assistant captures the user's speech, parses what the user said with a speech algorithm, automatically searches for the desired content through the Baidu search engine, and presents the results.

However, current speech parsing algorithms place high demands on the user's pronunciation and require standard Mandarin. In practice, the Mandarin spoken in each region carries a strong local accent, which makes speech parsing difficult and greatly reduces its accuracy, thereby limiting both the wide application of speech technology and the user experience.

In other words, existing speech parsing algorithms are suited only to standard Mandarin; when pronunciation is non-standard, parsing accuracy drops sharply. A characteristic parameter commonly used in speech parsing is the formant amplitude and frequency. Formants are the regions of the short-time power spectrum of speech where energy is concentrated, and the formant parameters fully determine the properties of the vowels in an utterance. Because of regional differences, the Mandarin spoken in each area carries a strong accent, so the formant amplitudes and frequencies of the speech differ greatly from the preset Mandarin parameters, which makes speech parsing inaccurate.
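The idea of "energy concentrations in the short-time power spectrum" can be illustrated with a toy sketch. It is not a real formant tracker and is not taken from the patent: the frame is a synthetic sum of two sinusoids, the preset values in `PRESET_MANDARIN` are made-up numbers, and all function names are hypothetical. It only shows how spectral peaks might be located and compared against preset parameters.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum of one short frame (bins 0 .. N/2)."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def energy_peaks(frame, sample_rate, count=2):
    """Frequencies (Hz) of the `count` strongest local maxima, ascending."""
    spec = power_spectrum(frame)
    n = len(frame)
    maxima = [k for k in range(1, len(spec) - 1)
              if spec[k] > spec[k - 1] and spec[k] > spec[k + 1]]
    strongest = sorted(maxima, key=lambda k: spec[k], reverse=True)[:count]
    return sorted(k * sample_rate / n for k in strongest)


def max_deviation(measured, preset):
    """Largest absolute difference (Hz) between measured and preset peaks."""
    return max(abs(m - p) for m, p in zip(measured, preset))


SAMPLE_RATE = 8000  # Hz, assumed for the example
# Synthetic 50 ms frame with energy concentrated at 700 Hz and 1200 Hz.
FRAME = [math.sin(2 * math.pi * 700 * i / SAMPLE_RATE)
         + 0.6 * math.sin(2 * math.pi * 1200 * i / SAMPLE_RATE)
         for i in range(400)]

PEAKS = energy_peaks(FRAME, SAMPLE_RATE)
# Illustrative preset only; not real Mandarin formant data.
PRESET_MANDARIN = [730.0, 1090.0]
DEVIATION = max_deviation(PEAKS, PRESET_MANDARIN)
```

A large `DEVIATION` relative to the preset is the kind of mismatch the paragraph above attributes to accented speech.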

Existing voice control for Internet of Things (IoT) home appliances likewise places fairly strict requirements on the user's pronunciation: reasonably standard Mandarin is needed for control instructions to be parsed accurately. In practice, users in each region generally speak with a local dialect accent, which gives rise to locally flavored varieties of Mandarin such as "Qing-pu", "Shan-pu", and "Chuan-pu". Existing speech parsing algorithms parse such pronunciation with low accuracy, which hinders the adoption and use of voice control for smart home appliances.

Summary of the invention

In view of the problem that the prior art parses non-standard Mandarin with low accuracy, the present invention is proposed to provide a speech recognition method and apparatus that overcome the above problem or at least partially solve it.

The present invention provides a speech recognition method, comprising:

obtaining, through a client, position information of the location of a user, and loading, according to the position information, the dialect speech library corresponding to the regional dialect of that location;

obtaining, through the client, speech information input by the user, calling the dialect speech library, and parsing and recognizing the speech information using the speech parsing algorithm corresponding to the dialect speech library.

The present invention also provides a speech recognition apparatus, comprising:

a loading module, configured to obtain, through a client, position information of the location of a user, and to load, according to the position information, the dialect speech library corresponding to the regional dialect of that location;

a parsing and recognition module, configured to obtain, through the client, speech information input by the user, to call the dialect speech library, and to parse and recognize the speech information using the speech parsing algorithm corresponding to the dialect speech library.

The beneficial effects of the present invention are as follows:

The user's region is determined by GPS positioning, and the speech decoding library for that region is then loaded so that speech is parsed in a targeted manner. This solves the prior-art problem of low parsing accuracy for non-standard Mandarin, improves the speech recognition rate and the parsing accuracy for different regional dialects, improves the user experience, and expands the user base.

The above is only an overview of the technical solution of the present invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of the invention become more apparent, specific embodiments of the invention are set out below.

Brief description of the drawings

Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:

Fig. 1 is a flowchart of the speech recognition method of an embodiment of the present invention;

Fig. 2 is a schematic diagram of the system framework of an embodiment of the present invention;

Fig. 3 is a detailed flowchart of the speech recognition method of an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of the speech recognition apparatus of an embodiment of the present invention.

Detailed description of the invention

Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure can be understood more thoroughly and its scope can be fully conveyed to those skilled in the art.

To solve the prior-art problem of low parsing accuracy for non-standard Mandarin, the present invention provides a speech recognition method and apparatus, which are described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and do not limit it.

Method embodiment

According to an embodiment of the present invention, a speech recognition method is provided. Fig. 1 is a flowchart of the speech recognition method of the embodiment. As shown in Fig. 1, the method includes the following processing:

Step 101: obtain, through a client, position information of the location of a user, and load, according to the position information, the dialect speech library corresponding to the regional dialect of that location. Preferably, in this embodiment, a standard Mandarin speech library may also be loaded through the client.

Step 102: obtain, through the client, the speech information input by the user, call the dialect speech library, and parse and recognize the speech information using the speech parsing algorithm corresponding to the dialect speech library.

Preferably, in this embodiment, to improve recognition accuracy, if recognition of the speech information fails in step 102, the standard Mandarin speech library is called and the speech information is parsed and recognized again using the speech parsing algorithm corresponding to the standard Mandarin speech library.
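Steps 101 and 102, together with the fallback to standard Mandarin, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `SpeechLibrary`, `load_libraries`, and `recognize` are hypothetical names, and the "parsing algorithms" are stand-in dictionary lookups rather than real recognizers.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# A parser takes raw audio and returns recognized text, or None on failure.
Parser = Callable[[bytes], Optional[str]]


@dataclass(frozen=True)
class SpeechLibrary:
    """A speech library paired with its own parsing algorithm (hypothetical)."""
    name: str
    parse: Parser


def load_libraries(
    region: str,
    dialect_libraries: Dict[str, SpeechLibrary],
    mandarin: SpeechLibrary,
) -> Tuple[Optional[SpeechLibrary], SpeechLibrary]:
    """Step 101: pick the dialect library for the region; keep Mandarin too."""
    return dialect_libraries.get(region), mandarin


def recognize(
    audio: bytes,
    dialect: Optional[SpeechLibrary],
    mandarin: SpeechLibrary,
) -> Tuple[Optional[str], Optional[str]]:
    """Step 102 with fallback: try the dialect library, then standard Mandarin.

    Returns (text, name of the library that succeeded), or (None, None).
    """
    if dialect is not None:
        text = dialect.parse(audio)
        if text is not None:
            return text, dialect.name
    text = mandarin.parse(audio)
    if text is not None:
        return text, mandarin.name
    return None, None


# Stand-in parsers: each "recognizes" only the utterances it has been given.
DIALECTS = {
    "region_a": SpeechLibrary(
        "dialect_region_a", {b"accented": "turn on the air conditioner"}.get
    ),
}
MANDARIN = SpeechLibrary("standard_mandarin", {b"standard": "turn off the light"}.get)

dialect, mandarin = load_libraries("region_a", DIALECTS, MANDARIN)
```

Returning the name of the library that succeeded is a design choice made for this sketch so that a caller can tell whether the fallback was used.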

After the speech information has been parsed and recognized, the recognized voice instruction can be further analyzed to obtain the IoT smart appliance to be controlled and the action instruction. The action instruction is then either sent directly to the corresponding IoT smart appliance to control it, or sent to the voice platform of the IoT smart appliances, which forwards it to the corresponding appliance to control it.
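The analysis step can be pictured as mapping recognized text to an appliance and an action. The sketch below assumes simple keyword matching; the appliance list, the phrase-to-action table, and the names `ActionInstruction` and `analyze` are all hypothetical, since the patent does not specify how the analysis is performed.

```python
from typing import NamedTuple, Optional


class ActionInstruction(NamedTuple):
    appliance: str  # which IoT smart appliance to control
    action: str     # what that appliance should do


# Hypothetical vocabulary for the example.
APPLIANCES = ("air conditioner", "washing machine", "water heater")
ACTIONS = {"turn on": "power_on", "turn off": "power_off"}


def analyze(text: str) -> Optional[ActionInstruction]:
    """Extract the target appliance and action from a recognized instruction.

    Returns None when either part cannot be identified.
    """
    lowered = text.lower()
    appliance = next((a for a in APPLIANCES if a in lowered), None)
    action = next((code for phrase, code in ACTIONS.items() if phrase in lowered), None)
    if appliance is None or action is None:
        return None
    return ActionInstruction(appliance, action)
```

For instance, `analyze("Turn on the air conditioner")` yields an instruction targeting the air conditioner with the `power_on` action, while text naming no known appliance yields `None`.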

The above technical solution of the embodiment of the present invention is described in detail below with reference to the drawings.

Fig. 2 is a schematic diagram of the system framework of an embodiment of the present invention. As shown in Fig. 2, after the app client is started, the user's region is determined by GPS positioning, and the speech library for that region is loaded. The user inputs a voice instruction through the microphone; the speech parsing module captures the input speech, calls the loaded speech library, and uses the speech parsing algorithm to parse the user's voice instruction. The instruction is then sent to the smart appliance, and the appliance is controlled through the appliance controller. By optimizing the speech parsing algorithm for the applicable user group, the embodiment improves the experience of users in different regions.
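How a GPS fix might be mapped to a region and then to a speech library can be sketched as below. The patent does not describe this mapping, so everything here is an assumption: the bounding boxes are coarse placeholders rather than real administrative boundaries, and `REGION_BOXES`, `region_for`, and `library_name_for` are hypothetical names.

```python
from typing import Dict, Optional, Tuple

# (lat_min, lat_max, lon_min, lon_max); rough illustrative boxes only.
REGION_BOXES: Dict[str, Tuple[float, float, float, float]] = {
    "shandong": (34.0, 38.5, 114.5, 123.0),
    "sichuan": (26.0, 34.5, 97.0, 108.5),
}


def region_for(lat: float, lon: float) -> Optional[str]:
    """Return the first region whose box contains the GPS fix, else None."""
    for region, (lat_min, lat_max, lon_min, lon_max) in REGION_BOXES.items():
        if lat_min <= lat <= lat_max and lon_min <= lon <= lon_max:
            return region
    return None


def library_name_for(lat: float, lon: float) -> str:
    """Name of the speech library to load; standard Mandarin if no region matches."""
    region = region_for(lat, lon)
    return f"dialect_{region}" if region is not None else "standard_mandarin"
```

A production system would more likely use reverse geocoding to an administrative region; the boxes are used here only to keep the example self-contained.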

Fig. 3 is a detailed flowchart of the speech recognition method of an embodiment of the present invention. As shown in Fig. 3, the method specifically includes the following processing:

Step 1: the app client obtains the natural language input by the user through the microphone and converts it, by analysis, into language information;

Step 2: the app client, through the voice module, analyzes the language information, matches the device, and sends an instruction;

Step 3: the IoT smart appliance receives the information and formulates an instruction;

Step 4: home appliance control is completed through the appliance controller.

Through the above processing, the user can control IoT home appliances directly in natural language.

Existing voice control for IoT home appliances has a high recognition rate for standard Mandarin but a low one for dialect-accented Mandarin. As a result, the voice functions in many existing mobile applications are of marginal value, which limits the range of users and prevents general adoption. The technical solution of the embodiment compiles different dialect speech libraries, optimizes the speech parsing algorithm, and uses GPS positioning to improve the speech recognition rate for different regional dialects, improving the user experience and expanding the user base.

Apparatus embodiment

According to an embodiment of the present invention, a speech recognition apparatus is provided. Fig. 4 is a schematic structural diagram of the speech recognition apparatus of the embodiment. As shown in Fig. 4, the apparatus comprises a loading module 40 and a parsing and recognition module 42. Each module of the embodiment is described in detail below.

The loading module 40 is configured to obtain, through a client, position information of the location of a user, and to load, according to the position information, the dialect speech library corresponding to the regional dialect of that location. The loading module 40 is further configured to load a standard Mandarin speech library through the client.

The parsing and recognition module 42 is configured to obtain, through the client, the speech information input by the user, to call the dialect speech library, and to parse and recognize the speech information using the speech parsing algorithm corresponding to the dialect speech library.

The parsing and recognition module 42 is further configured to: if recognition of the speech information fails, call the standard Mandarin speech library and parse and recognize the speech information again using the speech parsing algorithm corresponding to the standard Mandarin speech library.

Preferably, the speech recognition apparatus according to the embodiment of the present invention further comprises:

an analysis module, configured to analyze the voice instruction obtained by parsing and recognition to obtain the IoT smart appliance to be controlled and the action instruction;

a control module, configured to send the action instruction to the corresponding IoT smart appliance to control it; or to send the action instruction to the voice platform of the IoT smart appliances, which forwards it to the corresponding IoT smart appliance to control it.
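The two delivery routes of the control module can be sketched as a small class. The text only states that either route may be used; the policy shown here (use a direct link when one exists, otherwise go through the voice platform) is an assumption made for illustration, and `ControlModule`, `direct_links`, and `voice_platform` are hypothetical names.

```python
from typing import Callable, Dict, List, Tuple


class ControlModule:
    """Delivers an action instruction directly or via the voice platform."""

    def __init__(
        self,
        direct_links: Dict[str, Callable[[str], None]],
        voice_platform: Callable[[str, str], None],
    ) -> None:
        self.direct_links = direct_links      # appliance name -> direct sender
        self.voice_platform = voice_platform  # forwards (appliance, action)

    def send(self, appliance: str, action: str) -> str:
        """Send the action and report which route was taken."""
        link = self.direct_links.get(appliance)
        if link is not None:
            link(action)  # route 1: straight to the appliance
            return "direct"
        self.voice_platform(appliance, action)  # route 2: platform forwards it
        return "platform"


# Stand-in transports that simply record what they were asked to deliver.
direct_log: List[str] = []
platform_log: List[Tuple[str, str]] = []

module = ControlModule(
    direct_links={"air conditioner": direct_log.append},
    voice_platform=lambda appliance, action: platform_log.append((appliance, action)),
)
route_a = module.send("air conditioner", "power_on")
route_b = module.send("water heater", "power_off")
```

In this run the air conditioner is reached directly, while the water heater, which has no direct link, is reached through the platform.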

In summary, the user's region is determined by GPS positioning, and the speech decoding library for that region is then loaded so that speech is parsed in a targeted manner. This solves the prior-art problem of low parsing accuracy for non-standard Mandarin, improves the speech recognition rate and the parsing accuracy for different regional dialects, improves the user experience, and expands the user base.

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.

The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other device. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is apparent. Moreover, the present invention is not directed to any particular programming language. It should be understood that the content of the invention described herein may be implemented in a variety of programming languages, and that the description of a specific language above is given to disclose the best mode of the invention.

Numerous specific details are set forth in the specification provided here. It is to be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.

Similarly, it should be understood that, to streamline the disclosure and aid understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, this method of disclosure is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.

Those skilled in the art will appreciate that the modules in the client of an embodiment may be adaptively changed and arranged in one or more clients different from that embodiment. The modules in an embodiment may be combined into one module, and may furthermore be divided into a plurality of submodules, subunits, or subcomponents. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or client so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, an equivalent, or a similar purpose.

Furthermore, those skilled in the art will appreciate that although some embodiments described herein include some features included in other embodiments but not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.

The component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components in the client loaded with sorted web addresses according to embodiments of the present invention. The present invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described herein. Such a program implementing the invention may be stored on a computer-readable medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.

It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order; these words may be interpreted as names.

Claims

1. A speech recognition method, characterized by comprising:

obtaining, through a client, position information of the location of a user, and loading, according to said position information, the dialect speech library corresponding to the regional dialect of that location;

obtaining, through the client, speech information input by the user, calling said dialect speech library, and parsing and recognizing said speech information using the speech parsing algorithm corresponding to said dialect speech library.

2. The method according to claim 1, characterized in that the method further comprises:

loading a standard Mandarin speech library through said client.

3. The method according to claim 2, characterized in that, after calling said dialect speech library and parsing and recognizing said speech information using the speech parsing algorithm corresponding to said dialect speech library, the method further comprises:

if recognition of said speech information fails, calling said standard Mandarin speech library, and parsing and recognizing said speech information again using the speech parsing algorithm corresponding to said standard Mandarin speech library.

4. The method according to any one of claims 1 to 3, characterized in that, after said speech information is parsed and recognized, the method further comprises:

analyzing the voice instruction obtained by parsing and recognition to obtain the IoT smart appliance to be controlled and the action instruction.

5. The method according to claim 4, characterized in that, after the IoT smart appliance to be controlled and the action instruction are obtained, the method further comprises:

sending said action instruction to the corresponding IoT smart appliance to control said IoT smart appliance; or

sending said action instruction to the voice platform of the IoT smart appliances, said voice platform forwarding it to the corresponding IoT smart appliance to control said IoT smart appliance.

6. A speech recognition apparatus, characterized by comprising:

a loading module, configured to obtain, through a client, position information of the location of a user, and to load, according to said position information, the dialect speech library corresponding to the regional dialect of that location;

a parsing and recognition module, configured to obtain, through the client, speech information input by the user, to call said dialect speech library, and to parse and recognize said speech information using the speech parsing algorithm corresponding to said dialect speech library.

7. The apparatus according to claim 6, characterized in that said loading module is further configured to load a standard Mandarin speech library through said client.

8. The apparatus according to claim 7, characterized in that said parsing and recognition module is further configured to: if recognition of said speech information fails, call said standard Mandarin speech library, and parse and recognize said speech information again using the speech parsing algorithm corresponding to said standard Mandarin speech library.

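9. The apparatus according to any one of claims 6 to 8, characterized in that the apparatus further comprises:

an analysis module, configured to analyze the voice instruction obtained by parsing and recognition to obtain the IoT smart appliance to be controlled and the action instruction.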

10. The apparatus according to claim 9, characterized in that the apparatus further comprises:

a control module, configured to send said action instruction to the corresponding IoT smart appliance to control said IoT smart appliance; or to send said action instruction to the voice platform of the IoT smart appliances, said voice platform forwarding it to the corresponding IoT smart appliance to control said IoT smart appliance.