CN108922520A

CN108922520A - Speech recognition method, apparatus, storage medium and electronic device

Info

Publication number: CN108922520A
Application number: CN201810764393.1A
Authority: CN
Inventors: 陈岩
Original assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Current assignee: Guangdong Oppo Mobile Telecommunications Corp Ltd
Priority date: 2018-07-12
Filing date: 2018-07-12
Publication date: 2018-11-30
Anticipated expiration: 2038-07-12
Also published as: CN108922520B

Abstract

Embodiments of the present application provide a speech recognition method, apparatus, storage medium and electronic device. The speech recognition method includes: when voice information input by a user is received, obtaining the current geographic location of the electronic device; obtaining a speech recognition matching degree threshold according to the geographic location and a correspondence between historical geographic locations and speech recognition matching degree thresholds; matching the voice information against a preset speech recognition model to obtain a speech recognition matching degree; and, when the speech recognition matching degree is greater than the speech recognition matching degree threshold, executing the operation corresponding to the instruction in the voice information. With this method, the electronic device can dynamically adjust the speech recognition matching degree threshold according to the user's usage habits at different locations, which can reduce the number of recognition failures and save the time the electronic device spends on speech recognition, thereby improving the efficiency of speech recognition on the electronic device.

Description

Speech recognition method, apparatus, storage medium and electronic device

Technical field

The present application relates to the technical field of speech recognition, and in particular to a speech recognition method, apparatus, storage medium and electronic device.

Background

With the rapid development of electronic technology, electronic devices such as smartphones offer more and more functions. For example, a user can control an electronic device by voice to execute its various functions.

When a user controls an electronic device by voice, the electronic device first needs to recognize the user's voice. However, in settings where the user frequently uses the voice control function, the electronic device performs the same speech recognition every time, which reduces the efficiency of speech recognition.

Summary of the invention

Embodiments of the present application provide a speech recognition method, apparatus, storage medium and electronic device that can improve the efficiency with which an electronic device performs speech recognition.

An embodiment of the present application provides a speech recognition method, including:

when voice information input by a user is received, obtaining the current geographic location of the electronic device;

obtaining a speech recognition matching degree threshold according to the geographic location and a correspondence between historical geographic locations and speech recognition matching degree thresholds;

matching the voice information against a preset speech recognition model to obtain a speech recognition matching degree;

when the speech recognition matching degree is greater than the speech recognition matching degree threshold, executing the operation corresponding to the instruction in the voice information.

An embodiment of the present application further provides a speech recognition apparatus, including:

a first acquisition module, configured to obtain the current geographic location of the electronic device when voice information input by a user is received;

a second acquisition module, configured to obtain a speech recognition matching degree threshold according to the geographic location and a correspondence between historical geographic locations and speech recognition matching degree thresholds;

a matching module, configured to match the voice information against a preset speech recognition model to obtain a speech recognition matching degree;

an execution module, configured to execute the operation corresponding to the instruction in the voice information when the speech recognition matching degree is greater than the speech recognition matching degree threshold.

An embodiment of the present application further provides a storage medium storing a computer program which, when run on a computer, causes the computer to execute the above speech recognition method.

An embodiment of the present application further provides an electronic device including a processor and a memory. The memory stores a computer program, and the processor executes the above speech recognition method by calling the computer program stored in the memory.

An embodiment of the present application further provides an electronic device including a processor and a microphone electrically connected to the processor, wherein:

the microphone is configured to receive voice information input by a user;

the processor is configured to:

obtain the current geographic location of the electronic device;

The speech recognition method provided by the embodiments of the present application includes: when voice information input by a user is received, obtaining the current geographic location of the electronic device; obtaining a speech recognition matching degree threshold according to the geographic location and a correspondence between historical geographic locations and speech recognition matching degree thresholds; matching the voice information against a preset speech recognition model to obtain a speech recognition matching degree; and, when the speech recognition matching degree is greater than the speech recognition matching degree threshold, executing the operation corresponding to the instruction in the voice information. With this method, the electronic device can dynamically adjust the speech recognition matching degree threshold according to the user's usage habits at different locations, which can reduce the number of recognition failures and save the time the electronic device spends on speech recognition, thereby improving the efficiency of speech recognition on the electronic device.

Brief description of the drawings

In order to explain the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application; those skilled in the art can derive other drawings from them without creative effort.

Fig. 1 is a schematic diagram of a user performing voice control on an electronic device.

Fig. 2 is a first schematic flowchart of the speech recognition method provided by an embodiment of the present application.

Fig. 3 is a second schematic flowchart of the speech recognition method provided by an embodiment of the present application.

Fig. 4 is a third schematic flowchart of the speech recognition method provided by an embodiment of the present application.

Fig. 5 is a fourth schematic flowchart of the speech recognition method provided by an embodiment of the present application.

Fig. 6 is a fifth schematic flowchart of the speech recognition method provided by an embodiment of the present application.

Fig. 7 is a schematic structural diagram of the speech recognition apparatus provided by an embodiment of the present application.

Fig. 8 is another schematic structural diagram of the speech recognition apparatus provided by an embodiment of the present application.

Fig. 9 is a schematic structural diagram of the electronic device provided by an embodiment of the present application.

Fig. 10 is another schematic structural diagram of the electronic device provided by an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.

The terms "first", "second", "third" and the like (if present) in the description, the claims and the above drawings of the present application are used to distinguish similar objects, not to describe a particular order or sequence. It should be understood that objects described in this way are interchangeable where appropriate. In addition, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process or method that contains a series of steps, or an apparatus, electronic device or system that contains a series of modules or units, is not necessarily limited to the steps, modules or units that are explicitly listed; it may also include steps, modules or units that are not explicitly listed, or other steps, modules or units inherent to the process, method, apparatus, electronic device or system.

Referring to Fig. 1, Fig. 1 is a schematic diagram of a user performing voice control on an electronic device. The user utters a segment of speech, and the electronic device collects the user's voice information. The electronic device then compares the collected voice information with a speech recognition model stored in the electronic device. When the voice information matches the speech recognition model, the electronic device recognizes a control instruction from the voice information. The electronic device then executes the operation corresponding to the control instruction, such as lighting up the screen, opening an application, exiting an application or locking the screen, thereby realizing the user's voice control of the electronic device.

An embodiment of the present application provides a speech recognition method that can be applied in an electronic device. The electronic device may be a smartphone, a tablet computer, a game device, an AR (Augmented Reality) device, a data storage device, an audio playback device, a video playback device, a laptop computer, a desktop computing device, or the like.

As shown in Fig. 2, the speech recognition method may include the following steps:

110. When voice information input by a user is received, obtain the current geographic location of the electronic device.

A positioning system is provided in the electronic device. For example, the electronic device may include a positioning system such as GPS (Global Positioning System) or BDS (BeiDou Navigation Satellite System).

After the speech recognition function of the electronic device is turned on, the electronic device can continuously collect the user's voice information. For example, a microphone may be provided in the electronic device, and the electronic device can collect the voice information input by the user through the microphone. The user's voice information is a sentence that the user inputs to the electronic device and is used to control the electronic device by voice. The voice information may include one or more instructions, such as "lock the screen" or "increase the volume".

When the electronic device receives the voice information input by the user, it can obtain its current geographic location through the positioning system. The geographic location obtained by the electronic device may include coordinate information or area information of the current geographic location. The coordinate information may include, for example, the longitude and latitude of the current geographic location. The area information may include, for example, the street, residential community, supermarket or subway station where the current geographic location lies.

120. Obtain a speech recognition matching degree threshold according to the geographic location and a correspondence between historical geographic locations and speech recognition matching degree thresholds.

A correspondence between historical geographic locations and speech recognition matching degree thresholds may be preset in the electronic device. A historical geographic location is a geographic location at which the electronic device has previously performed speech recognition. In the correspondence, different historical geographic locations may correspond to different speech recognition matching degree thresholds. Therefore, when the electronic device is at different geographic locations, the corresponding speech recognition matching degree threshold may also be different.

The speech recognition matching degree threshold is the dividing line between a successful match and a failed match of the voice information against the speech recognition model. When the matching degree between the voice information and the speech recognition model preset in the electronic device is greater than the speech recognition matching degree threshold, the voice information has been successfully matched with the speech recognition model. When that matching degree is less than or equal to the speech recognition matching degree threshold, the match between the voice information and the speech recognition model has failed.

After the electronic device obtains its current geographic location, it can obtain the speech recognition matching degree threshold according to the geographic location and the correspondence between historical geographic locations and speech recognition matching degree thresholds.
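The text does not say how a current coordinate is related to a stored historical geographic location. As a minimal sketch, assuming each historical location is stored as a latitude/longitude pair and that the nearest one within a fixed radius is used, the lookup in step 120 might look like the following. The names `threshold_for`, `MAX_DISTANCE_M` and `DEFAULT_THRESHOLD`, the 200 m radius and the fallback value are all hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000   # mean Earth radius in metres
MAX_DISTANCE_M = 200         # hypothetical matching radius, not from the source
DEFAULT_THRESHOLD = 95       # hypothetical fallback threshold, in percent


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def threshold_for(current, history):
    """Return the threshold of the nearest historical location within range.

    current: (lat, lon) of the device now.
    history: list of (lat, lon, threshold_percent) entries.
    """
    best_threshold = None
    best_distance = MAX_DISTANCE_M
    for lat, lon, threshold in history:
        distance = haversine_m(current[0], current[1], lat, lon)
        if distance <= best_distance:
            best_threshold, best_distance = threshold, distance
    # Assumption: locations with no nearby history use the fallback threshold.
    return DEFAULT_THRESHOLD if best_threshold is None else best_threshold
```

If the device reports area information (a street or subway station name) instead of coordinates, a plain dictionary lookup keyed by that name would serve the same purpose.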

130. Match the voice information against a preset speech recognition model to obtain a speech recognition matching degree.

The electronic device can match the received voice information against the speech recognition model preset in the electronic device to obtain the matching degree between the voice information and the speech recognition model. The matching degree indicates how similar the voice information is to the speech recognition model, or how well the two agree.

The preset speech recognition model may be a model generated when the user enables the speech recognition function of the electronic device for the first time: the electronic device collects training voice information from the user and generates the speech recognition model from that training voice information.

140. When the speech recognition matching degree is greater than the speech recognition matching degree threshold, execute the operation corresponding to the instruction in the voice information.

After the electronic device obtains the matching degree between the voice information and the preset speech recognition model, it can compare the matching degree with the speech recognition matching degree threshold to determine which of the two is larger.

When the matching degree is greater than the speech recognition matching degree threshold, the voice information has been successfully matched with the preset speech recognition model. The electronic device can then analyze the voice information further to obtain the control instruction it contains, and execute the operation corresponding to that instruction, such as locking the screen or increasing the volume of the electronic device.

In the embodiments of the present application, the speech recognition matching degree threshold obtained by the electronic device may differ when the device is at different geographic locations. Therefore, when performing speech recognition at different geographic locations, the electronic device can dynamically adjust the speech recognition matching degree threshold according to the user's usage habits at those locations, which can reduce the number of recognition failures and save the time the electronic device spends on speech recognition, thereby improving the efficiency of speech recognition on the electronic device.
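Steps 120 to 140 can be sketched as a single function. This is an illustration under stated assumptions, not the patented implementation: locations are plain labels, matching degrees are integer percentages, the matcher is passed in as a callable, and `handle_voice` and `DEFAULT_THRESHOLD` are hypothetical names. The fallback for a location with no history is also an assumption, since the text does not cover that case.

```python
from typing import Callable, Mapping, Optional

DEFAULT_THRESHOLD = 95  # hypothetical fallback (percent) for unknown locations


def handle_voice(
    voice: str,
    location: str,
    thresholds: Mapping[str, int],
    match: Callable[[str], int],
) -> Optional[str]:
    """Return the instruction to execute, or None if recognition fails."""
    # Step 120: look up the threshold for the current location.
    threshold = thresholds.get(location, DEFAULT_THRESHOLD)
    # Step 130: match the voice information against the preset model.
    degree = match(voice)
    # Step 140: succeed only when the degree is strictly greater.
    if degree > threshold:
        # Stand-in for parsing the control instruction out of the voice input.
        return voice.strip().lower()
    return None
```

With the same matching degree of 88%, a request succeeds at a frequently used location whose threshold is 80% and fails at a rarely used one whose threshold is 95%, which is the behaviour the paragraph above describes.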

In some embodiments, as shown in Fig. 3, before step 110 (obtaining the current geographic location of the electronic device when voice information input by a user is received), the method further includes the following steps:

151. Obtain multiple pieces of speech recognition history data of the electronic device, each piece of speech recognition history data including the historical geographic location of the electronic device when it performed speech recognition;

152. Perform cluster analysis on the multiple pieces of speech recognition history data to obtain the number of times the electronic device performed speech recognition at each historical geographic location;

153. Generate the correspondence between historical geographic locations and speech recognition matching degree thresholds according to the number of times the electronic device performed speech recognition at each historical geographic location.

Each time the electronic device performs speech recognition, it can record its geographic location at that time to form speech recognition history data, and store that data in the electronic device. The geographic locations in the speech recognition history data are the historical geographic locations.

The electronic device can obtain multiple pieces of speech recognition history data, each of which includes the historical geographic location of the electronic device when it performed speech recognition. For example, the speech recognition history data obtained by the electronic device may include 100 records.

The electronic device can then perform cluster analysis on the multiple pieces of speech recognition history data to obtain the number of times it performed speech recognition at each historical geographic location. For example, the result of the cluster analysis may be: 20 speech recognitions at historical geographic location A, 30 at historical geographic location B, 10 at historical geographic location C, and 40 at historical geographic location D.

The electronic device then generates the correspondence between historical geographic locations and speech recognition matching degree thresholds according to the number of times it performed speech recognition at each historical geographic location. For example, the more times the electronic device has performed speech recognition at a historical geographic location, the lower the corresponding speech recognition matching degree threshold.

For example, the correspondence generated by the electronic device between historical geographic locations and speech recognition matching degree thresholds may be as shown in Table 1:

Table 1

Historical geographic location	Number of speech recognitions	Speech recognition matching degree threshold
Historical geographic location D	40	80%
...	...	...
Historical geographic location B	30	85%
...	...	...
Historical geographic location A	20	90%
...	...	...
Historical geographic location C	10	95%
...	...	...
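Steps 151 to 153 can be sketched as follows. Two simplifications are assumed: each history record already carries a location label, so the cluster analysis reduces to counting records per label, and the threshold is derived by a linear rule (100 minus half the count, clamped to 80..95 percent). That rule is an assumption chosen because it reproduces the four rows of Table 1; the source only states that more recognitions at a location mean a lower threshold. The name `build_thresholds` and the clamp bounds are hypothetical.

```python
from collections import Counter

MIN_THRESHOLD = 80  # hypothetical lower clamp, in percent
MAX_THRESHOLD = 95  # hypothetical upper clamp, in percent


def build_thresholds(history):
    """Map each historical location label to a threshold in percent.

    history: iterable of dicts with a "location" key, one per recognition.
    """
    # Step 152 (simplified): count recognitions per location label.
    counts = Counter(record["location"] for record in history)
    # Step 153: more recognitions at a location -> lower threshold.
    return {
        location: max(MIN_THRESHOLD, min(MAX_THRESHOLD, 100 - count // 2))
        for location, count in counts.items()
    }
```

Using integer percentages avoids floating-point comparison surprises when the matching degree is later compared with the threshold.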

In some embodiments, as shown in Fig. 4, step 153 (generating the correspondence between historical geographic locations and speech recognition matching degree thresholds according to the number of times the electronic device performed speech recognition at each historical geographic location) includes the following steps:

1531. Set multiple preset count intervals;

1532. Determine, according to the number of times the electronic device performed speech recognition at each historical geographic location, the historical geographic locations whose counts fall within each preset count interval;

1533. Set one speech recognition matching degree threshold for the historical geographic locations whose counts fall within each preset count interval.

The electronic device may have performed speech recognition at many historical geographic locations, and the number of recognitions at each location may be different, which would make the correspondence between historical geographic locations and speech recognition matching degree thresholds overly complex.

Therefore, multiple preset count intervals can be set in the electronic device for the number of times it performed speech recognition at each historical geographic location. For example, preset count intervals such as (0,15], (15,25], (25,35] and (35,45] may be set in the electronic device.

The electronic device can determine, according to the number of speech recognitions performed at each historical geographic location, the historical geographic locations whose counts fall within each preset count interval. For example, if the preset count interval (15,25] contains counts such as 16, 18, 20 and 22, and the historical geographic locations corresponding to the counts 16, 18, 20 and 22 are A1, A2, A and A3 respectively, then the historical geographic locations corresponding to the preset count interval (15,25] include A1, A2, A and A3.

The electronic device can set one speech recognition matching degree threshold for the historical geographic locations whose counts fall within each preset count interval. Historical geographic locations with similar counts thus share the same speech recognition matching degree threshold, which simplifies the correspondence between historical geographic locations and speech recognition matching degree thresholds.
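The interval scheme of steps 1531 to 1533 can be sketched with a sorted list of interval upper bounds. The bounds come from the example intervals in the text; the per-interval thresholds are hypothetical values (chosen to agree with Table 1), and treating counts above 45 like the last interval is also an assumption. `threshold_for_count` is a hypothetical name.

```python
from bisect import bisect_left

# Upper bounds of the half-open intervals (0,15], (15,25], (25,35], (35,45].
UPPER_BOUNDS = [15, 25, 35, 45]
# Hypothetical thresholds (percent), one per interval.
INTERVAL_THRESHOLDS = [95, 90, 85, 80]


def threshold_for_count(count):
    """Return the shared threshold for the interval that contains count."""
    if count <= 0:
        raise ValueError("count must be positive")
    # bisect_left puts a count equal to an upper bound in that interval,
    # which matches the closed right end of (lo, hi].
    index = bisect_left(UPPER_BOUNDS, count)
    # Assumption: counts above the last bound reuse the last interval.
    index = min(index, len(INTERVAL_THRESHOLDS) - 1)
    return INTERVAL_THRESHOLDS[index]
```

Under these values, the locations A1, A2, A and A3 with counts 16, 18, 20 and 22 all receive the same threshold, so the stored correspondence needs one entry per interval rather than one per count.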

For example, the correspondence between historical geographic locations and speech recognition matching degree thresholds may be as shown in Table 2:

Table 2

In some embodiments, as shown in Fig. 5, step 152 (performing cluster analysis on the multiple pieces of speech recognition history data to obtain the number of times the electronic device performed speech recognition at each historical geographic location) includes the following step:

1521. Perform cluster analysis on the multiple pieces of speech recognition history data to obtain the number of times the electronic device performed speech recognition at each historical moment at each historical geographic location;

Step 153 (generating the correspondence between historical geographic locations and speech recognition matching degree thresholds according to the number of times the electronic device performed speech recognition at each historical geographic location) includes the following step:

1534. Generate a correspondence among historical geographic locations, historical moments and speech recognition matching degree thresholds according to the number of times the electronic device performed speech recognition at each historical moment at each historical geographic location.

Here, each piece of speech recognition history data obtained by the electronic device further includes the historical moment at which the electronic device performed speech recognition. The historical moment can be represented by the time of day.

The electronic device can perform cluster analysis on the multiple pieces of speech recognition history data to obtain the number of times it performed speech recognition at each historical moment at each historical geographic location. For example, the result of the cluster analysis may be: 5 speech recognitions at historical geographic location A at moment T1, 8 at historical geographic location A at moment T2, 3 at historical geographic location B at moment T1, 10 at historical geographic location B at moment T2, and so on.

The electronic device can generate the correspondence among historical geographic locations, historical moments and speech recognition matching degree thresholds according to the number of speech recognitions performed at each historical moment at each historical geographic location.
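The counting in step 1521 can be sketched by keying each record on a (location, moment) pair. The text only says that the historical moment can be represented by the time of day; grouping moments by the hour is an assumption made here, and `count_by_location_and_hour` is a hypothetical name. As before, cluster analysis is simplified to counting records that already carry a location label.

```python
from collections import Counter
from datetime import datetime


def count_by_location_and_hour(history):
    """Count recognitions per (location label, hour of day).

    history: iterable of (location_label, datetime) pairs.
    """
    return Counter((location, when.hour) for location, when in history)


# Example usage with a few hypothetical records.
example_history = [
    ("A", datetime(2018, 7, 1, 9, 5)),
    ("A", datetime(2018, 7, 2, 9, 40)),
    ("A", datetime(2018, 7, 2, 16, 0)),
    ("B", datetime(2018, 7, 3, 16, 30)),
]
example_counts = count_by_location_and_hour(example_history)
```

The resulting counts can then be turned into thresholds with whatever count-to-threshold rule the device uses, keyed on the same (location, hour) pair.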

For example, the correspondence generated by the electronic device among historical geographic locations, historical moments and speech recognition matching degree thresholds may be as shown in Table 3:

Table 3

In this way, the speech recognition matching degree threshold corresponds to both the historical geographic location and the historical moment, so the threshold can be adjusted according to both the geographic location and the moment. The electronic device can therefore adjust the speech recognition matching degree threshold in a more targeted way when performing speech recognition, which maintains the accuracy of speech recognition while improving its efficiency.

In some embodiments, as shown in Fig. 6, after step 110 (obtaining the current geographic location of the electronic device when voice information input by a user is received), the method further includes the following step:

161. Obtain the current moment;

Step 120 (obtaining a speech recognition matching degree threshold according to the geographic location and the correspondence between historical geographic locations and speech recognition matching degree thresholds) includes the following step:

121. Obtain the speech recognition matching degree threshold according to the geographic location, the moment, and the correspondence among historical geographic locations, historical moments and speech recognition matching degree thresholds.

After obtaining its current geographic location, the electronic device can further obtain the current moment. The current moment can be represented by the current time of day; for example, the current moment may be 16:00.

The electronic device then obtains the speech recognition matching degree threshold according to the geographic location, the moment, and the correspondence among historical geographic locations, historical moments and speech recognition matching degree thresholds.

For example, if the current geographic location is geographic location B and the current moment is 16:00, the electronic device can obtain a speech recognition matching degree threshold of 88% according to the correspondence.
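Step 121 can be sketched as a lookup keyed on both the location and the moment. Only the entry for geographic location B at 16:00 (88%) comes from the text; the other table entries, the hour-level granularity, the fallback value, and the names `lookup_threshold` and `EXAMPLE_TABLE` are hypothetical.

```python
DEFAULT_THRESHOLD = 95  # hypothetical fallback threshold, in percent

EXAMPLE_TABLE = {
    ("B", 16): 88,  # from the example in the text
    ("B", 9): 92,   # hypothetical entry
    ("A", 16): 90,  # hypothetical entry
}


def lookup_threshold(table, location, clock):
    """Return the threshold for a location label and an "HH:MM" time string."""
    # Assumption: moments are grouped by the hour of the day.
    hour = int(clock.split(":")[0])
    # Assumption: unknown (location, hour) pairs use the fallback threshold.
    return table.get((location, hour), DEFAULT_THRESHOLD)
```

For instance, `lookup_threshold(EXAMPLE_TABLE, "B", "16:00")` gives 88, matching the example above, while a pair with no history falls back to the default.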

In specific implementations, the present application is not limited by the described order of execution of the steps; where no conflict arises, some steps may be performed in another order or simultaneously.

As can be seen from the above, the speech recognition method provided by the embodiments of the present application includes: when voice information input by a user is received, obtaining the current geographic location of the electronic device; obtaining a speech recognition matching degree threshold according to the geographic location and a correspondence between historical geographic locations and speech recognition matching degree thresholds; matching the voice information against a preset speech recognition model to obtain a speech recognition matching degree; and, when the speech recognition matching degree is greater than the speech recognition matching degree threshold, executing the operation corresponding to the instruction in the voice information. With this method, the electronic device can dynamically adjust the speech recognition matching degree threshold according to the user's usage habits at different locations, which can reduce the number of recognition failures and save the time the electronic device spends on speech recognition, thereby improving the efficiency of speech recognition on the electronic device.

An embodiment of the present application further provides a speech recognition apparatus that can be integrated in an electronic device. The electronic device may be a smartphone, a tablet computer, a game device, an AR (Augmented Reality) device, a data storage device, an audio playback device, a video playback device, a laptop computer, a desktop computing device, or the like.

As shown in fig. 7, speech recognition equipment 200 may include：First obtain module 201, second obtain module 202, With module 203, execution module 204.

First obtains module 201, for obtaining electronic equipment and being presently in when receiving the voice messaging of user's input Geographical location.

When electronic equipment receives the voice messaging of user's input, the first acquisition module 201 can pass through positioning system To obtain the geographical location that electronic equipment is presently in.Wherein, the first geographical location for obtaining the acquisition of module 201 may include working as The coordinate information or the area information of current geographic position in preceding geographical location etc..The coordinate information in geographical location for example can wrap Include the information such as longitude, the latitude of current geographic position.The area information in geographical location for example may include current geographic position institute The information such as street, cell, supermarket, the subway station at place.

Second obtains module 202, for according to the geographical location and historical geography position and speech recognition match degree Corresponding relationship between threshold value obtains speech recognition match degree threshold value.

After first acquisition module 201 gets the geographical location being presently in, second obtains module 202 can be according to described Corresponding relationship between geographical location and historical geography position and speech recognition match degree threshold value obtains speech recognition match degree Threshold value.

The matching module 203 is configured to match the voice information against a preset speech recognition model to obtain a speech recognition matching degree.

The matching module 203 can match the received voice information against the speech recognition model preset in the electronic device to obtain the matching degree between the voice information and the speech recognition model. The matching degree indicates the degree of similarity or agreement between the voice information and the speech recognition model.

The execution module 204 is configured to execute the operation corresponding to the instruction in the voice information when the speech recognition matching degree is greater than the speech recognition matching degree threshold.

After the matching module 203 obtains the matching degree between the voice information and the preset speech recognition model, the execution module 204 can compare the matching degree with the speech recognition matching degree threshold to determine which of the two is larger.

When the matching degree is greater than the speech recognition matching degree threshold, the voice information is considered to match the preset speech recognition model successfully. The execution module 204 can then further analyze the voice information to obtain the control instruction it contains and execute the operation corresponding to that instruction, such as locking the screen of the electronic device or increasing its volume.
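The decision made by the second obtaining module and the execution module can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `handle_voice_input`, the `default_threshold` fallback for locations with no history, and the representation of a location as a string label are all assumptions introduced here.

```python
from typing import Callable, Dict


def handle_voice_input(
    match_degree: float,
    location: str,
    thresholds: Dict[str, float],
    default_threshold: float,
    execute: Callable[[], None],
) -> bool:
    """Run `execute` only if the matching degree exceeds the location's threshold.

    `thresholds` maps a historical location label to its threshold.
    `default_threshold` is a hypothetical fallback for unknown locations.
    """
    threshold = thresholds.get(location, default_threshold)
    # The text requires the matching degree to be strictly greater than the threshold.
    if match_degree > threshold:
        execute()
        return True
    return False
```

With a threshold of 0.80 at location "A" and a fallback of 0.90, a matching degree of 0.85 would trigger the operation at "A" but not at an unknown location.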

In some embodiments, as shown in Fig. 8, the speech recognition apparatus 200 further includes a generation module 205. The generation module 205 is configured to perform the following steps:

obtaining multiple pieces of speech recognition history data of the electronic device, where each piece of speech recognition history data includes the historical geographic location at which the electronic device performed speech recognition;

performing cluster analysis on the multiple pieces of speech recognition history data to obtain the number of times the electronic device performed speech recognition at each historical geographic location;

generating the correspondence between historical geographic locations and speech recognition matching degree thresholds according to the number of times the electronic device performed speech recognition at each historical geographic location.

The generation module 205 can obtain multiple pieces of speech recognition history data. Each piece of speech recognition history data includes the historical geographic location at which the electronic device performed speech recognition. For example, the obtained speech recognition history data may include 100 history records.

The generation module 205 can then perform cluster analysis on the multiple pieces of speech recognition history data to obtain the number of times the electronic device performed speech recognition at each historical geographic location. For example, the result of the cluster analysis may be: 20 speech recognitions at historical geographic location A, 30 at historical geographic location B, 10 at historical geographic location C, and 40 at historical geographic location D.
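The counting step can be sketched as follows. The text does not specify the clustering algorithm, so this sketch assumes each history record already carries a location label and simply groups records by that label; the record layout (a dict with a `"location"` key) and the function name are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def count_recognitions_by_location(history: Iterable[dict]) -> Dict[str, int]:
    """Count how many history records fall at each labeled location.

    Grouping by an existing label stands in for the cluster analysis
    described in the text; raw coordinates would need a real clustering step.
    """
    return dict(Counter(record["location"] for record in history))


# 100 records distributed as in the example above.
example_history = (
    [{"location": "A"}] * 20
    + [{"location": "B"}] * 30
    + [{"location": "C"}] * 10
    + [{"location": "D"}] * 40
)
counts = count_recognitions_by_location(example_history)
```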

The generation module 205 then generates the correspondence between historical geographic locations and speech recognition matching degree thresholds according to the number of times the electronic device performed speech recognition at each historical geographic location. For example, the more often the electronic device has performed speech recognition at a given historical geographic location, the lower the corresponding speech recognition matching degree threshold.
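One way to realize "more recognitions, lower threshold" is a linear mapping from count to threshold. The text does not give a formula, so the linear rule and the bounds `low=0.80` and `high=0.95` below are assumptions chosen only for illustration.

```python
from typing import Dict


def thresholds_from_counts(
    counts: Dict[str, int], low: float = 0.80, high: float = 0.95
) -> Dict[str, float]:
    """Map each location's count to a threshold that decreases as the count grows.

    The least-used location gets `high`, the most-used gets `low`.
    Both bounds are hypothetical values, not taken from the source.
    """
    if not counts:
        return {}
    min_count = min(counts.values())
    span = max(counts.values()) - min_count
    result = {}
    for location, n in counts.items():
        # If all counts are equal, every location keeps the `high` threshold.
        fraction = (n - min_count) / span if span else 0.0
        result[location] = round(high - fraction * (high - low), 4)
    return result
```

For the counts in the example above (A: 20, B: 30, C: 10, D: 40), this rule orders the thresholds D < B < A < C, so the most frequently used location is the easiest to trigger.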

For example, the correspondence between historical geographic locations and speech recognition matching degree thresholds generated by the generation module 205 may be as shown in Table 4:

Table 4

In some embodiments, when generating the correspondence between historical geographic locations and speech recognition matching degree thresholds according to the number of times the electronic device performed speech recognition at each historical geographic location, the generation module 205 is configured to perform the following steps:

setting multiple preset count intervals;

determining, according to the number of times the electronic device performed speech recognition at each historical geographic location, the multiple historical geographic locations corresponding to the counts included in each preset count interval;

setting one speech recognition matching degree threshold for the multiple historical geographic locations corresponding to the counts included in each preset count interval.

Accordingly, the generation module 205 can set multiple preset count intervals in the electronic device for the number of times the electronic device performs speech recognition at each historical geographic location. For example, preset count intervals such as (0, 15], (15, 25], (25, 35], and (35, 45] can be set in the electronic device.

The generation module 205 can determine, according to the number of speech recognitions performed at each historical geographic location, the multiple historical geographic locations corresponding to the counts included in each preset count interval. For example, the preset count interval (15, 25] includes counts such as 16, 18, 20, and 22; if the historical geographic locations corresponding to the counts 16, 18, 20, and 22 are A1, A2, A, and A3 respectively, then the historical geographic locations corresponding to the preset count interval (15, 25] include A1, A2, A, and A3.

The generation module 205 can set one speech recognition matching degree threshold for the multiple historical geographic locations corresponding to the counts included in each preset count interval. In this way, historical geographic locations with similar counts correspond to the same speech recognition matching degree threshold, which simplifies the correspondence between historical geographic locations and speech recognition matching degree thresholds.
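The interval-based assignment can be sketched with a sorted list of upper bounds and a binary search. The interval bounds come from the example above; the per-interval threshold values are hypothetical, since the table that would list them is not reproduced in this text.

```python
from bisect import bisect_left
from typing import Dict, Optional

# Upper bounds of the half-open intervals (0, 15], (15, 25], (25, 35], (35, 45].
UPPER_BOUNDS = [15, 25, 35, 45]
# Hypothetical thresholds, one per interval; higher counts get lower thresholds.
INTERVAL_THRESHOLDS = [0.95, 0.90, 0.85, 0.80]


def threshold_for_count(count: int) -> Optional[float]:
    """Return the threshold of the interval containing `count`, or None if outside."""
    if count <= 0 or count > UPPER_BOUNDS[-1]:
        return None
    # bisect_left finds the first upper bound >= count, i.e. the (lo, hi] interval.
    return INTERVAL_THRESHOLDS[bisect_left(UPPER_BOUNDS, count)]


def thresholds_by_interval(counts: Dict[str, int]) -> Dict[str, Optional[float]]:
    """Assign every location the threshold of the interval its count falls in."""
    return {location: threshold_for_count(n) for location, n in counts.items()}
```

Applied to the counts A1: 16, A2: 18, A: 20, A3: 22, all four locations fall in (15, 25] and therefore share one threshold.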

For example, the correspondence between historical geographic locations and speech recognition matching degree thresholds may be as shown in Table 5:

Table 5

In some embodiments, when performing cluster analysis on the multiple pieces of speech recognition history data to obtain the number of times the electronic device performed speech recognition at each historical geographic location, the generation module 205 is configured to perform the following step:

performing cluster analysis on the multiple pieces of speech recognition history data to obtain the number of times the electronic device performed speech recognition at each historical time at each historical geographic location;

When generating the correspondence between historical geographic locations and speech recognition matching degree thresholds according to the number of times the electronic device performed speech recognition at each historical geographic location, the generation module 205 is configured to perform the following step:

generating the correspondence among historical geographic locations, historical times, and speech recognition matching degree thresholds according to the number of times the electronic device performed speech recognition at each historical time at each historical geographic location.

Here, each piece of speech recognition history data obtained by the generation module 205 further includes the historical time at which the electronic device performed speech recognition. The historical time can be represented by the time of day.

The generation module 205 can perform cluster analysis on the multiple pieces of speech recognition history data to obtain the number of times the electronic device performed speech recognition at each historical time at each historical geographic location. For example, the result of the cluster analysis may be: 5 speech recognitions for historical geographic location A at time T1, 8 for historical geographic location A at time T2, 3 for historical geographic location B at time T1, 10 for historical geographic location B at time T2, and so on.
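Counting by both location and time can be sketched by keying the counter on a (location, time slot) pair. As before, grouping by pre-assigned labels stands in for the cluster analysis, and the record keys `"location"` and `"time_slot"` are hypothetical names introduced for this example.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def count_by_location_and_time(history: Iterable[dict]) -> Dict[Tuple[str, str], int]:
    """Count history records per (location label, time-slot label) pair."""
    return dict(Counter((r["location"], r["time_slot"]) for r in history))


# Records distributed as in the example above.
example_history = (
    [{"location": "A", "time_slot": "T1"}] * 5
    + [{"location": "A", "time_slot": "T2"}] * 8
    + [{"location": "B", "time_slot": "T1"}] * 3
    + [{"location": "B", "time_slot": "T2"}] * 10
)
pair_counts = count_by_location_and_time(example_history)
```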

The generation module 205 can generate the correspondence among historical geographic locations, historical times, and speech recognition matching degree thresholds according to the number of speech recognitions performed at each historical time at each historical geographic location.

For example, the correspondence among historical geographic locations, historical times, and speech recognition matching degree thresholds generated by the generation module 205 may be as shown in Table 6:

Table 6

In some embodiments, the first obtaining module 201 is further configured to perform the following step:

obtaining the current time;

When obtaining the speech recognition matching degree threshold according to the geographic location and the correspondence between historical geographic locations and speech recognition matching degree thresholds, the second obtaining module 202 is configured to perform the following step:

obtaining the speech recognition matching degree threshold according to the geographic location, the time, and the correspondence among historical geographic locations, historical times, and speech recognition matching degree thresholds.

After obtaining the current geographic location, the first obtaining module 201 can further obtain the current time. The current time can be represented by the time of day; for example, the current time is 16:00.

The second obtaining module 202 then obtains the speech recognition matching degree threshold according to the geographic location, the time, and the correspondence among historical geographic locations, historical times, and speech recognition matching degree thresholds.

For example, if the current geographic location is geographic location B and the current time is 16:00, the second obtaining module 202 can obtain a speech recognition matching degree threshold of 88% according to the correspondence.
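A lookup keyed on location and time could be sketched as below. The 88% value for location B at 16:00 comes from the example above; the time-slot boundaries, the second table row, and the `default` fallback are assumptions, because the source table is not reproduced in this text.

```python
from datetime import time
from typing import Dict, Tuple

# A time slot is a half-open range [start, end) within one day.
TimeSlot = Tuple[time, time]

EXAMPLE_TABLE: Dict[Tuple[str, TimeSlot], float] = {
    ("B", (time(12, 0), time(18, 0))): 0.88,   # 88% as in the text; slot bounds assumed
    ("B", (time(18, 0), time(23, 59))): 0.92,  # hypothetical row
}


def lookup_threshold(
    location: str,
    moment: time,
    table: Dict[Tuple[str, TimeSlot], float],
    default: float,
) -> float:
    """Return the threshold whose location and time slot match, else `default`."""
    for (loc, (start, end)), threshold in table.items():
        if loc == location and start <= moment < end:
            return threshold
    return default
```

Calling `lookup_threshold("B", time(16, 0), EXAMPLE_TABLE, 0.90)` selects the first row, while an unknown location falls back to the assumed default.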

In a specific implementation, each of the above modules can be implemented as an independent entity, or the modules can be combined arbitrarily and implemented as one or several entities.

As can be seen from the above, in the speech recognition apparatus 200 provided by the embodiments of the present application, when voice information input by a user is received, the first obtaining module 201 obtains the geographic location where the electronic device is currently located; the second obtaining module 202 obtains a speech recognition matching degree threshold according to the geographic location and the correspondence between historical geographic locations and speech recognition matching degree thresholds; the matching module 203 matches the voice information against a preset speech recognition model to obtain a speech recognition matching degree; and the execution module 204 executes the operation corresponding to the instruction in the voice information when the speech recognition matching degree is greater than the speech recognition matching degree threshold. The speech recognition apparatus can dynamically adjust the speech recognition matching degree threshold according to the user's habits of using the electronic device at different locations, which can reduce the number of recognition failures and save the time the electronic device spends on speech recognition, thereby improving the efficiency of speech recognition on the electronic device.

An embodiment of the present application also provides an electronic device. The electronic device may be a smartphone, a tablet computer, or a similar device. As shown in Fig. 9, the electronic device 300 includes a processor 301 and a memory 302. The processor 301 is electrically connected to the memory 302.

The processor 301 is the control center of the electronic device 300. It connects the various parts of the entire electronic device through various interfaces and lines, and performs the various functions of the electronic device and processes data by running or calling the computer programs stored in the memory 302 and calling the data stored in the memory 302, thereby monitoring the electronic device as a whole.

In this embodiment, the processor 301 in the electronic device 300 loads the instructions corresponding to the processes of one or more computer programs into the memory 302 according to the following steps, and runs the computer programs stored in the memory 302 to realize various functions:

In some embodiments, before obtaining the geographic location where the electronic device is currently located when voice information input by a user is received, the processor 301 further performs the following steps:

In some embodiments, when generating the correspondence between historical geographic locations and speech recognition matching degree thresholds according to the number of times the electronic device performed speech recognition at each historical geographic location, the processor 301 performs the following steps:

setting multiple preset count intervals;

In some embodiments, each piece of speech recognition history data further includes the historical time at which the electronic device performed speech recognition;

When performing cluster analysis on the multiple pieces of speech recognition history data to obtain the number of times the electronic device performed speech recognition at each historical geographic location, the processor 301 performs the following steps:

When generating the correspondence between historical geographic locations and speech recognition matching degree thresholds according to the number of times the electronic device performed speech recognition at each historical geographic location, the processor 301 performs the following steps:

In some embodiments, after obtaining the geographic location where the electronic device is currently located when voice information input by a user is received, the processor 301 further performs the following steps:

obtaining the current time;

When obtaining the speech recognition matching degree threshold according to the geographic location and the correspondence between historical geographic locations and speech recognition matching degree thresholds, the processor 301 performs the following steps:

The memory 302 can be used to store computer programs and data. The computer programs stored in the memory 302 contain instructions executable by the processor. The computer programs can form various functional modules. The processor 301 performs various functional applications and data processing by calling the computer programs stored in the memory 302.

In some embodiments, as shown in Fig. 10, the electronic device 300 further includes a radio frequency circuit 303, a display screen 304, a control circuit 305, an input unit 306, an audio circuit 307, a sensor 308, and a power supply 309. The processor 301 is electrically connected to each of the radio frequency circuit 303, the display screen 304, the control circuit 305, the input unit 306, the audio circuit 307, the sensor 308, and the power supply 309.

The radio frequency circuit 303 is used to transmit and receive radio frequency signals so as to communicate wirelessly with network devices or other electronic devices.

The display screen 304 can be used to display information input by the user or information provided to the user, as well as the various graphical user interfaces of the electronic device. These graphical user interfaces can be composed of images, text, icons, video, and any combination thereof.

The control circuit 305 is electrically connected to the display screen 304 and is used to control the display screen 304 to display information.

The input unit 306 can be used to receive input digits, character information, or user characteristic information (such as a fingerprint), and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control. The input unit 306 may include a fingerprint recognition module.

The audio circuit 307 can provide an audio interface between the user and the electronic device through a speaker and a microphone. The audio circuit 307 includes a microphone. The microphone is electrically connected to the processor 301 and is used to receive voice information input by the user.

The sensor 308 is used to collect external environment information. The sensor 308 may include one or more of an ambient light sensor, an acceleration sensor, a gyroscope, and similar sensors.

The power supply 309 is used to supply power to the components of the electronic device 300. In some embodiments, the power supply 309 may be logically connected to the processor 301 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system.

Although not shown in Fig. 10, the electronic device 300 may also include a camera, a Bluetooth module, and the like, which are not described in detail here.

As can be seen from the above, an embodiment of the present application provides an electronic device that performs the following steps: when voice information input by a user is received, obtaining the geographic location where the electronic device is currently located; obtaining a speech recognition matching degree threshold according to the geographic location and the correspondence between historical geographic locations and speech recognition matching degree thresholds; matching the voice information against a preset speech recognition model to obtain a speech recognition matching degree; and, when the speech recognition matching degree is greater than the speech recognition matching degree threshold, executing the operation corresponding to the instruction in the voice information. The electronic device can dynamically adjust the speech recognition matching degree threshold according to the user's usage habits at different locations, which can reduce the number of recognition failures and save the time the electronic device spends on speech recognition, thereby improving the efficiency of speech recognition on the electronic device.

An embodiment of the present application also provides a storage medium in which a computer program is stored. When the computer program runs on a computer, the computer executes the speech recognition method described in any of the above embodiments.

It should be noted that those of ordinary skill in the art will understand that all or part of the steps in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program can be stored in a computer-readable storage medium, which may include but is not limited to: read-only memory (ROM, Read Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, an optical disc, or the like.

The speech recognition method, apparatus, storage medium, and electronic device provided by the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help understand the method of the present application and its core idea. At the same time, those skilled in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims

1. A speech recognition method, characterized by comprising:

obtaining a speech recognition matching degree threshold according to the geographic location and the correspondence between historical geographic locations and speech recognition matching degree thresholds;

when the speech recognition matching degree is greater than the speech recognition matching degree threshold, executing the operation corresponding to the instruction in the voice information.

2. The speech recognition method according to claim 1, characterized in that, before obtaining the geographic location where the electronic device is currently located when voice information input by a user is received, the method further comprises:

obtaining multiple pieces of speech recognition history data of the electronic device, each piece of speech recognition history data including the historical geographic location at which the electronic device performed speech recognition;

performing cluster analysis on the multiple pieces of speech recognition history data to obtain the number of times the electronic device performed speech recognition at each historical geographic location;

generating the correspondence between historical geographic locations and speech recognition matching degree thresholds according to the number of times the electronic device performed speech recognition at each historical geographic location.

3. The speech recognition method according to claim 2, characterized in that the step of generating the correspondence between historical geographic locations and speech recognition matching degree thresholds according to the number of times the electronic device performed speech recognition at each historical geographic location comprises:

setting multiple preset count intervals;

determining, according to the number of times the electronic device performed speech recognition at each historical geographic location, the multiple historical geographic locations corresponding to the counts included in each preset count interval;

setting one speech recognition matching degree threshold for the multiple historical geographic locations corresponding to the counts included in each preset count interval.

4. The speech recognition method according to claim 2, characterized in that each piece of speech recognition history data further includes the historical time at which the electronic device performed speech recognition;

performing cluster analysis on the multiple pieces of speech recognition history data to obtain the number of times the electronic device performed speech recognition at each historical geographic location comprises:

performing cluster analysis on the multiple pieces of speech recognition history data to obtain the number of times the electronic device performed speech recognition at each historical time at each historical geographic location;

generating the correspondence between historical geographic locations and speech recognition matching degree thresholds according to the number of times the electronic device performed speech recognition at each historical geographic location comprises:

generating the correspondence among historical geographic locations, historical times, and speech recognition matching degree thresholds according to the number of times the electronic device performed speech recognition at each historical time at each historical geographic location.

5. The speech recognition method according to claim 4, characterized in that, after obtaining the geographic location where the electronic device is currently located when voice information input by a user is received, the method further comprises:

obtaining the current time;

the step of obtaining the speech recognition matching degree threshold according to the geographic location and the correspondence between historical geographic locations and speech recognition matching degree thresholds comprises:

obtaining the speech recognition matching degree threshold according to the geographic location, the time, and the correspondence among historical geographic locations, historical times, and speech recognition matching degree thresholds.

6. A speech recognition apparatus, characterized by comprising:

a first obtaining module, configured to obtain the geographic location where the electronic device is currently located when voice information input by a user is received;

a second obtaining module, configured to obtain a speech recognition matching degree threshold according to the geographic location and the correspondence between historical geographic locations and speech recognition matching degree thresholds;

a matching module, configured to match the voice information against a preset speech recognition model to obtain a speech recognition matching degree;

an execution module, configured to execute the operation corresponding to the instruction in the voice information when the speech recognition matching degree is greater than the speech recognition matching degree threshold.

7. The speech recognition apparatus according to claim 6, characterized by further comprising a generation module, the generation module being configured to:

8. A storage medium, characterized in that a computer program is stored in the storage medium, and when the computer program runs on a computer, the computer is caused to execute the speech recognition method according to any one of claims 1 to 5.

9. An electronic device, characterized in that the electronic device includes a processor and a memory, a computer program is stored in the memory, and the processor is configured to execute the speech recognition method according to any one of claims 1 to 5 by calling the computer program stored in the memory.

10. An electronic device, characterized in that the electronic device includes a processor and a microphone electrically connected to the processor, wherein:

the microphone is configured to receive voice information input by a user;

the processor is configured to:

obtain the geographic location where the electronic device is currently located;