Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention, without creative effort, fall within the protection scope of the present invention.
To solve the problem in the prior art that Chinese voice navigation service systems have a low speech recognition success rate, the embodiments of the present invention provide a navigation method and system based on natural speech recognition.
As shown in Figure 1, the navigation system based on natural speech recognition provided by an embodiment of the present invention comprises: a one-button control device 101, a terminal device 102 and a cloud computing platform server 103.
The one-button control device 101 is mounted on a fixed part of the vehicle and is configured to, after the user presses the start button, establish a connection with the terminal device 102 directly or by short-range communication, and to drive the terminal device 102, directly or by short-range communication, to establish a connection with the cloud computing platform server 103.
The terminal device 102 is configured to: after connecting to the one-button control device 101, establish a connection with the cloud computing platform server 103 through a voice call switching network or any of several wireless data networks; receive the navigation location voice information sent by the user and send it to the cloud computing platform server 103; receive the automatic navigation control information, containing the navigation destination address, returned by the cloud computing platform server 103; start the navigation function according to the automatic navigation control information; establish a connection with a navigation server; obtain the navigation result for the navigation destination address from the navigation server; and display the navigation result to the user.
The cloud computing platform server 103 is located on the network side and comprises:
A speaker-independent speech recognition module 1031, configured to recognize and parse the navigation location voice information sent by the terminal device 102 and obtain the text information corresponding to the navigation location voice information;
A natural speech recognition module 1032, configured to segment the text information obtained by the speaker-independent speech recognition module 1031 into words using a preset dictionary, obtain the words contained in the text information, search a point of interest (POI) database according to those words, and obtain the target POI information with the highest degree of match to the words contained in the text information, wherein the dictionary stores the target words on which speech recognition is to be performed;
In this embodiment, the target words stored in the dictionary may be words of broad scope; specifically, target words may be collected from information encountered in daily life and work to form the dictionary, for example by extracting words from daily news reports. The target words stored in the dictionary may also be words of narrow scope; specifically, target words may be obtained from the POI information stored in the POI database to form the dictionary. It should be noted that, whether the words are of broad or narrow scope, every target word in the dictionary is unique and no target word is repeated.
To reduce redundancy among the target words in the dictionary, save storage space and increase speech recognition speed, the embodiment of the present invention preferably sets the target words in the dictionary to narrow-scope words derived from the POI database, but is not limited to this arrangement. As is well known to those skilled in the art, for each industry in which this recognition technology is applied, practitioners in that industry can configure its POI database appropriately according to the characteristics of the industry.
In this embodiment, the natural speech recognition module 1032 may search the dictionary according to the text information obtained by the speaker-independent speech recognition module 1031, matching the characters of the text information, in order of appearance, against the target words contained in the dictionary. When a word that fully matches a target word is found, that word is split off from the text information, and the search is repeated until the last character of the text information is reached, thereby segmenting the text information into words.
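The dictionary-driven segmentation described above can be sketched as a left-to-right scan. This is only an illustration under stated assumptions: the function name `segment`, the longest-match-first rule and the policy of skipping characters that start no target word are not specified in the text, and the Chinese strings are an assumed rendering of the example address used later in this description.

```python
def segment(text, dictionary):
    """Split text into dictionary words by scanning from left to right."""
    longest = max((len(word) for word in dictionary), default=1)
    words = []
    i = 0
    while i < len(text):
        # Assumption: try the longest candidate first (forward maximum matching).
        for size in range(min(longest, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in dictionary:
                words.append(candidate)  # split the matched word off
                i += size
                break
        else:
            # Assumed policy: a character that starts no target word is skipped.
            i += 1
    return words


# Hypothetical narrow-scope dictionary derived from POI entries.
DICTIONARY = {"北京", "石景山区", "八角", "东路", "小肥羊", "火锅店"}
words = segment("北京石景山区八角东路小肥羊火锅店", DICTIONARY)
print(words)
```

Holding the dictionary in a set keeps each candidate lookup cheap, which fits the stated goal of a small dictionary that is easy to search.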
It should be noted that, to speed up data retrieval and thus speech recognition, in this embodiment the POI database and the dictionary are preferably both stored in the cloud computing platform server 103 (not shown in Figure 1).
Further, in this embodiment, the natural speech recognition module 1032 may obtain from the POI database the target POI information with the highest degree of match to the words contained in the text information in either of two ways, which are described in turn below:
1. Weight coefficient judgment method
The natural speech recognition module 1032 is specifically configured to, if the dictionary also stores the weight grade n corresponding to each target word and the weight grade range N: obtain from the dictionary the weight grade corresponding to each word contained in the text information; search the POI database according to the words contained in the text information and obtain from the POI database a POI information set consisting of the POI information that matches any one or more of those words; process each piece of POI information in the POI information set according to the weight grades of the words contained in the text information, to obtain a weight coefficient for each piece of POI information; and select from the POI information set the POI information with the highest weight coefficient as the target POI information. Here n and N are integers, N ≥ 2 and n ∈ [1, N], and a target word of grade n is more important in the text information than a target word of grade n+1. The relationship between importance and weight grade n may of course be reversed, and those skilled in the art may define it as needed; this embodiment uses the former convention as its example.
In this embodiment, the natural speech recognition module 1032 may use a weighted average algorithm to obtain the weight coefficient of each piece of POI information; other algorithms may of course be used to obtain the weight coefficient, and they are not enumerated here.
It should be noted that, to ensure the accuracy of the target POI information obtained by the natural speech recognition module 1032 and to improve speech recognition quality, in this embodiment the words obtained after the natural speech recognition module 1032 segments the text information should include at least one word of weight grade 1. If, after segmentation, none of the words contained in the text information has weight grade 1, the natural speech recognition module 1032 is further configured to segment the text information again to obtain at least one word of weight grade 1.
Further, the natural speech recognition module 1032 is also configured to add the at least one word of weight grade 1 obtained above to the dictionary.
It should be noted that the ordering of weight grades used in this embodiment is only a specific example. In actual use, the ordering of weight grades may be set by other rules; for example, when the weight grade range is 3, weight grade 3 may be set as the highest and weight grade 1 as the lowest. Such variations can readily be conceived by those skilled in the art without creative effort and are not described one by one here.
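As a rough illustration of the weight coefficient judgment method, the sketch below scores each candidate POI by a weighted average of the query words it contains. The text does not fix the formula, so everything here is an assumption: the mapping from grade n to a numeric weight (N - n + 1), plain substring containment as the matching test, the function names, and the English renderings of the POI names.

```python
def weight_coefficient(poi, words, grades, grade_range):
    """Weighted share of the query words that appear in one POI entry."""
    # Assumption: grade 1 is most important, so grade n gets weight N - n + 1.
    weights = {word: grade_range - grades[word] + 1 for word in words}
    total = sum(weights.values())
    matched = sum(weight for word, weight in weights.items() if word in poi)
    return matched / total if total else 0.0


def best_poi(pois, words, grades, grade_range):
    """Pick the POI with the highest weight coefficient among those matching any word."""
    candidates = [poi for poi in pois if any(word in poi for word in words)]
    if not candidates:
        return None
    return max(candidates,
               key=lambda poi: weight_coefficient(poi, words, grades, grade_range))


# Hypothetical data: grade 1 marks the most important word.
WORDS = ["Little Sheep", "hot pot", "Bajiao", "Beijing"]
GRADES = {"Little Sheep": 1, "hot pot": 2, "Bajiao": 2, "Beijing": 3}
POIS = [
    "Beijing Little Sheep Supermarket",
    "Shijingshan Gucheng Road Little Sheep hot pot restaurant",
    "Beijing Bajiao Little Sheep hot pot restaurant",
    "Beijing Donglaishun hot pot restaurant",
]
best = best_poi(POIS, WORDS, GRADES, grade_range=3)
print(best)
```

If the reversed convention mentioned above is wanted, only the line that maps a grade to a weight needs to change.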
2. Nested search method
The natural speech recognition module 1032 is specifically configured to: sort the words contained in the text information; according to the sorted order, obtain the first word to be searched from the words contained in the text information and obtain from the POI database the POI information matching the first word to be searched; obtain the second word to be searched from the words contained in the text information and obtain, from the POI information set formed by the POI information matching the first word to be searched, the POI information matching the second word to be searched; and so on, until the last word to be searched is obtained from the words contained in the text information and the target POI information matching the last word to be searched is obtained from the POI information set formed by the POI information matching the word immediately preceding the last word to be searched.
In this embodiment, the natural speech recognition module 1032 may sort the words in the order in which they appear in the text information. Preferably, to increase search speed, the natural speech recognition module 1032 may first obtain the keyword among the words contained in the text information and then sort the words in the order of keyword, rear auxiliary words and front auxiliary words.
Here, a keyword is a word with a proper, specific referent; a rear auxiliary word is a word located after the keyword in the text information; and a front auxiliary word is a word located before the keyword in the text information.
In this embodiment, the cloud computing platform server 103 (specifically, the natural speech recognition module 1032) may set up a keyword table in advance, and this keyword table may be built from the information stored in the POI database. After obtaining the words contained in the text information, the natural speech recognition module 1032 looks up each word in the keyword table, and the words that match keywords stored in the keyword table are the keywords contained in the text information.
It should be noted that, if the lookup shows that the words contained in the text information include no keyword, the natural speech recognition module 1032 sorts the words in the order in which they appear in the text information. Further, if the lookup shows that the text information contains two or more keywords, the rear auxiliary words are the non-keywords that follow the first keyword among the words contained in the text information, and the natural speech recognition module 1032 still sorts in the order of keyword, rear auxiliary words and front auxiliary words.
By sorting the words contained in the text information in the order of keyword, rear auxiliary words and front auxiliary words, the natural speech recognition module 1032 ensures that the key information comes first when the words are subsequently searched and matched in sorted order, which can significantly shorten the word search and matching time and increase speech recognition speed.
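The ordering rule (keywords first, then rear auxiliary words, then front auxiliary words) can be sketched as follows. The function name `order_words` and the use of a Python set for the keyword table are illustrative assumptions, and the English word renderings are approximate.

```python
def order_words(words, keyword_table):
    """Reorder segmented words as: keywords, rear auxiliary words, front auxiliary words."""
    first_keyword = next(
        (i for i, word in enumerate(words) if word in keyword_table), None
    )
    if first_keyword is None:
        # No keyword found: keep the order of appearance in the text.
        return list(words)
    keywords = [word for word in words if word in keyword_table]
    # Rear auxiliary words: non-keywords after the first keyword.
    rear = [w for w in words[first_keyword + 1:] if w not in keyword_table]
    # Front auxiliary words: non-keywords before the first keyword.
    front = [w for w in words[:first_keyword] if w not in keyword_table]
    return keywords + rear + front


segmented = ["Beijing", "Shijingshan District", "Bajiao", "East Road",
             "Little Sheep", "hot pot restaurant"]
ordered = order_words(segmented, {"Little Sheep"})
print(ordered)
```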
It should be noted that, if the natural speech recognition module 1032 finds no information matching the current word to be searched, the match information for the current word to be searched may be set to the information matched by the immediately preceding word to be searched; if the current word to be searched is the first word to be searched, the information matching the first word to be searched is all the POI information contained in the POI database.
To give those skilled in the art a deeper understanding of the nested search method described above, a specific implementation of the method is illustrated below with a concrete example:
For example, suppose the text information obtained by recognizing and parsing the navigation location voice information input by the user is "Little Sheep hot pot restaurant, Bajiao East Road, Shijingshan District, Beijing". The words obtained after segmentation by the natural speech recognition module 1032 may be: Beijing, Shijingshan District, Bajiao, East Road, Little Sheep, hot pot restaurant. If "Little Sheep" is the keyword, sorting in the order of keyword, rear auxiliary words and front auxiliary words gives: Little Sheep, hot pot restaurant, Beijing, Shijingshan District, Bajiao, East Road. Suppose the POI database contains entries such as: Beijing Little Sheep Supermarket; Shijingshan District Gucheng Road Little Sheep hot pot restaurant; Beijing Donglaishun hot pot restaurant; Beijing Bajiao North Road Donglaishun hot pot restaurant; and Beijing Bajiao Little Sheep hot pot restaurant. According to the nested search method described above, first, the natural speech recognition module 1032 obtains from the POI database the POI information matching "Little Sheep" to form a first POI information set, which contains: Beijing Little Sheep Supermarket; Shijingshan District Gucheng Road Little Sheep hot pot restaurant; and Beijing Bajiao Little Sheep hot pot restaurant. Second, it obtains from the first POI information set the POI information matching "hot pot restaurant" to form a second POI information set, which contains: Shijingshan District Gucheng Road Little Sheep hot pot restaurant; and Beijing Bajiao Little Sheep hot pot restaurant. Third, it obtains from the second POI information set the POI information matching "Beijing" to form a third POI information set, which contains: Beijing Bajiao Little Sheep hot pot restaurant. Fourth, it obtains from the third POI information set the POI information matching "Bajiao" to form a fourth POI information set, which contains: Beijing Bajiao Little Sheep hot pot restaurant. Fifth, it obtains from the fourth POI information set the target POI information matching "East Road"; because the fourth POI information set contains no POI information matching "East Road", the target POI information is the POI information contained in the fourth POI information set, namely Beijing Bajiao Little Sheep hot pot restaurant.
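The nested search, including the fallback to the previous result set when a word matches nothing, can be sketched in a few lines. This is a minimal illustration, not the patented implementation: substring containment as the matching test, the function name `nested_search` and the English renderings of the POI names are all assumptions.

```python
def nested_search(ordered_words, poi_database):
    """Narrow the candidate set word by word, keeping the previous set on a miss."""
    # Starting from the whole database covers the case where the first word matches nothing.
    current = list(poi_database)
    for word in ordered_words:
        matched = [poi for poi in current if word in poi]
        if matched:
            current = matched
        # Otherwise keep the set matched by the preceding word.
    return current


POI_DATABASE = [
    "Beijing Little Sheep Supermarket",
    "Shijingshan District Gucheng Road Little Sheep hot pot restaurant",
    "Beijing Donglaishun hot pot restaurant",
    "Beijing Bajiao North Road Donglaishun hot pot restaurant",
    "Beijing Bajiao Little Sheep hot pot restaurant",
]
ORDERED = ["Little Sheep", "hot pot restaurant", "Beijing",
           "Shijingshan District", "Bajiao", "East Road"]
targets = nested_search(ORDERED, POI_DATABASE)
print(targets)
```

Because each step filters only the survivors of the previous step, placing the keyword first shrinks the candidate set early, which is the speed benefit the sorted order is meant to deliver.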
With the weight coefficient judgment method and the nested search method described above, the natural speech recognition module 1032 can accurately find the target POI information with the highest degree of match to the words contained in the text information, thereby recognizing the voice information input by the user. In actual use, the natural speech recognition module 1032 may of course obtain this target POI information in other ways, which are not enumerated here.
A communication module 1033, configured to obtain the navigation destination address corresponding to the target POI information obtained by the natural speech recognition module 1032, carry the navigation destination address in the automatic navigation control information, and send it to the terminal device 102.
Further, if the natural speech recognition module 1032 selects two or more pieces of target POI information, then, to improve the accuracy of speech recognition, as shown in Figure 1, the terminal device 102 may also be configured to receive the two or more pieces of target POI information sent by the cloud computing platform server 103, display them to the user, receive a POI information selection indication sent by the user according to the two or more pieces of target POI information, and send the POI information selection indication to the cloud computing platform server 103.
Specifically, the terminal device 102 may receive a POI information selection indication that the user sends by voice, key press, text input or similar means. It should be noted that, if the user sends the POI information selection indication by voice, the cloud computing platform server 103 needs to use the speaker-independent speech recognition module 1031 to recognize and parse the indication and obtain the corresponding control instruction.
The cloud computing platform server 103 may also be configured to, if the natural speech recognition module 1032 finds two or more pieces of target POI information: send the two or more pieces of target POI information to the terminal device 102 through the communication module 1033; receive the POI information selection indication returned by the terminal device 102; select the preferred target POI information from the two or more pieces of target POI information according to the indication; and obtain the navigation destination address corresponding to the preferred target POI information.
Alternatively, as shown in Figure 2, the cloud computing platform server 103 further comprises:
A statistics module 1034, configured to compile statistics on navigation data and store the navigation data statistics;
In this embodiment, the statistics module 1034 may compile statistics on the POI information for which a user performs speech recognition each time; the statistics may cover an individual user or a specific group of users. Further, the speech recognition statistics may be the result of counting the number of times, or the frequency with which, a user has performed speech recognition for one or more pieces of target POI information, or may be statistics on the target POI information for which each of a plurality of users most recently performed speech recognition, or may be other statistics related to speech recognition, which are not enumerated here.
The communication module 1033 may also be configured to, if the natural speech recognition module 1032 finds two or more pieces of target POI information, obtain the navigation data statistics from the statistics module 1034, select the preferred target POI information from the two or more pieces of target POI information according to the navigation data statistics, and obtain the navigation destination address corresponding to the preferred target POI information.
For example, suppose the navigation data statistics are counts of the number of times the user has performed speech recognition for each of several pieces of target POI information. If the text information corresponding to the navigation location voice information input by the user is "Little Sheep hot pot restaurant" and the natural speech recognition module 1032 obtains three pieces of target POI information, namely "Haidian District Little Sheep hot pot restaurant", "Haidian District Zhongguancun Little Sheep hot pot restaurant" and "Shijingshan Bajiao East Road Little Sheep hot pot restaurant", the communication module 1033 may obtain the navigation data statistics corresponding to these three pieces of target POI information. If, say, speech recognition has been performed 3 times for "Haidian District Little Sheep hot pot restaurant", 5 times for "Haidian District Zhongguancun Little Sheep hot pot restaurant" and 40 times for "Shijingshan Bajiao East Road Little Sheep hot pot restaurant", the communication module 1033 may, according to the statistics, select "Shijingshan Bajiao East Road Little Sheep hot pot restaurant" from the three pieces of target POI information as the preferred target POI information.
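The count-based selection in this example amounts to taking the candidate with the largest recognition count. A minimal sketch follows, assuming the statistics are held as a mapping from POI name to count; the function name `pick_by_statistics` and the tie-breaking behaviour (the first candidate wins) are illustrative choices, not details given in the text.

```python
def pick_by_statistics(candidates, recognition_counts):
    """Return the candidate recognized most often; unseen candidates count as 0."""
    return max(candidates, key=lambda poi: recognition_counts.get(poi, 0))


CANDIDATES = [
    "Haidian District Little Sheep hot pot restaurant",
    "Haidian District Zhongguancun Little Sheep hot pot restaurant",
    "Shijingshan Bajiao East Road Little Sheep hot pot restaurant",
]
# Counts taken from the example above.
COUNTS = {
    CANDIDATES[0]: 3,
    CANDIDATES[1]: 5,
    CANDIDATES[2]: 40,
}
preferred = pick_by_statistics(CANDIDATES, COUNTS)
print(preferred)
```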
Optionally, to further shorten speech recognition time and increase speech recognition speed, in this embodiment the natural speech recognition module 1032 may also be configured to search a spoken-word dictionary according to the words contained in the text information and, according to the search result, delete spoken words from the words contained in the text information. The spoken-word dictionary stores spoken words, and the spoken words do not include any text that carries substantive meaning in the navigation location voice information input by the user.
In this embodiment, the spoken-word dictionary may be set up in advance by statistical methods and may contain spoken words that people use every day, for example: "I think", "I want", "may I ask", "is", "right", "can" and "how". The spoken words contained in the spoken-word dictionary are not enumerated one by one here.
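Removing spoken words is a simple filtering pass over the segmented words before the POI search. The sketch below is illustrative only; the set contents follow the examples listed above, while the names `SPOKEN_WORDS` and `drop_spoken_words` are assumptions.

```python
# Hypothetical spoken-word dictionary built from the examples in the text.
SPOKEN_WORDS = {"I think", "I want", "may I ask", "is", "right", "can", "how"}


def drop_spoken_words(words, spoken_words):
    """Remove words that carry no substantive meaning for the destination."""
    return [word for word in words if word not in spoken_words]


raw_words = ["I want", "Beijing", "Little Sheep", "hot pot restaurant"]
content_words = drop_spoken_words(raw_words, SPOKEN_WORDS)
print(content_words)
```

Fewer words after filtering means fewer POI lookups, which is where the stated reduction in recognition time would come from.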
In the navigation system based on natural speech recognition provided by the embodiments of the present invention, after the user presses the start button of the one-button control device mounted on the steering wheel of the vehicle, the terminal device establishes a voice session connection with the cloud computing platform server, and the system enters the automatic voice navigation state. When the user sends navigation location voice information to the cloud computing platform server through the terminal device, the cloud computing platform server first uses speaker-independent speech recognition technology to recognize and parse the navigation location voice information and obtain the corresponding text information, then performs information matching using the words contained in the text information, and takes the POI information in the POI database with the highest degree of match to those words as the target POI information obtained by recognizing the navigation location voice information. The cloud computing platform server can obtain the target POI information without an exact match to the navigation location voice information sent by the user, which improves the success rate of Chinese speech recognition and, in turn, the reliability of the voice navigation service and the user's experience of it. This solves the problem in the prior art that speech recognition is performed by exact matching of the voice information, so that recognition fails whenever the wording differs, the recognition success rate is low, the voice navigation service is unreliable and the user's experience of it is poor. Because the cloud computing platform server in the technical solution provided by the embodiments of the present invention performs speech recognition by word matching, it only needs to store target words in the dictionary and standard POI information in the POI database, and does not need to store a large number of differently worded text entries for the same thing. The dictionary and the POI database are therefore smaller and easier to search, which increases speech recognition speed. This solves the problem in the prior art that the vocabulary must store a large number of differently worded text entries for the same thing, which makes the vocabulary large and hard to search, slows speech recognition and causes high latency in the voice navigation service system. The natural speech recognition technology used by the cloud computing platform server in the technical solution provided by the embodiments of the present invention differs from English speech recognition technology: it addresses the characteristics of Chinese, namely a large vocabulary and sentences in which words run together without pauses, by segmenting each sentence into words and searching word by word, and it achieves a higher success rate and recognition speed for Chinese speech.
As shown in Figure 3, an embodiment of the present invention also provides a navigation method based on natural speech recognition, comprising:
Step 301, after the user presses the start button of the one-button control device, the one-button control device establishes a connection with the terminal device directly or by short-range communication, wherein the one-button control device is mounted at a fixed position in the vehicle and drives the terminal device, directly or by short-range communication, to establish a connection with the cloud computing platform server on the network side;
Step 302, the terminal device establishes a voice session connection with the cloud computing platform server through a voice call switching network or any of several wireless data networks;
Step 303, the terminal device receives the navigation location voice information sent by the user and sends it to the cloud computing platform server;
Step 304, the cloud computing platform server uses speaker-independent speech recognition technology to recognize and parse the navigation location voice information and obtains the text information corresponding to the navigation location voice information;
Step 305, the cloud computing platform server segments the text information into words using a preset dictionary and obtains the words contained in the text information, wherein the dictionary stores the target words on which speech recognition is to be performed;
Step 306, the cloud computing platform server searches the POI database according to the words contained in the text information and obtains from the POI database the target POI information with the highest degree of match to those words;
Step 307, the cloud computing platform server obtains the navigation destination address corresponding to the target POI information, carries the navigation destination address in the automatic navigation control information and sends it to the terminal device;
Step 308, the terminal device starts the navigation function according to the automatic navigation control information, establishes a connection with the navigation server, obtains the navigation result for the navigation destination address from the navigation server, and displays the result to the user.
Further, the navigation method based on natural speech recognition provided by the embodiment of the present invention may also comprise: if the dictionary also stores the weight grade n corresponding to each target word and the weight grade range N, the cloud computing platform server obtains from the dictionary the weight grade corresponding to each word contained in the text information, wherein n and N are integers, N ≥ 2 and n ∈ [1, N], and a target word of grade n is more important in the text information than a target word of grade n+1;
As shown in Figure 4, step 306 can comprise:
Step 3061, the cloud computing platform server searches the POI database according to the words contained in the text information and obtains from the POI database a POI information set consisting of the POI information that matches any one or more of those words;
Step 3062, the cloud computing platform server processes each piece of POI information in the POI information set according to the weight grades of the words contained in the text information, to obtain a weight coefficient for each piece of POI information;
Step 3063, the cloud computing platform server selects from the POI information set the POI information with the highest weight coefficient as the target POI information.
Further, to improve the accuracy of speech recognition, the navigation method based on natural speech recognition provided by the embodiment of the present invention may also comprise: if none of the words contained in the text information has weight grade 1, the cloud computing platform server segments the text information again to obtain at least one word of weight grade 1.
On this basis, the navigation method based on natural speech recognition provided by the embodiment of the present invention may also comprise: the cloud computing platform server adds the at least one word of weight grade 1 to the dictionary.
Further, as shown in Figure 5, step 306 can comprise:
Step 3064, the cloud computing platform server sorts the words contained in the text information;
Specifically, step 3064 may comprise: the cloud computing platform server obtains the keyword among the words contained in the text information; the cloud computing platform server sorts the words contained in the text information in the order of keyword, rear auxiliary words and front auxiliary words; wherein a rear auxiliary word is a word located after the keyword in the text information, and a front auxiliary word is a word located before the keyword in the text information.
It should be noted that, if the words contained in the text information include two or more keywords, the rear auxiliary words are the non-keywords that follow the first keyword among the words contained in the text information.
Step 3065, the cloud computing platform server, according to the sorted order, obtains the first word to be searched from the words contained in the text information and obtains from the POI database the POI information matching the first word to be searched;
Step 3066, the cloud computing platform server obtains the second word to be searched from the words contained in the text information and obtains, from the POI information set formed by the POI information matching the first word to be searched, the POI information matching the second word to be searched;
And so on, until step 3067, in which the cloud computing platform server obtains the last word to be searched from the words contained in the text information and obtains, from the POI information set formed by the POI information matching the word immediately preceding the last word to be searched, the target POI information matching the last word to be searched.
Further, if the cloud computing platform server finds two or more pieces of target POI information in step 306, the navigation method based on natural speech recognition provided by the embodiment of the present invention may also comprise: the cloud computing platform server sends the two or more pieces of target POI information to the terminal device; the terminal device displays the two or more pieces of target POI information to the user and receives a POI information selection indication sent by the user according to them; the terminal device sends the POI information selection indication to the cloud computing platform server; the cloud computing platform server selects the preferred target POI information from the two or more pieces of target POI information according to the POI information selection indication and obtains the navigation destination address corresponding to the preferred target POI information.
Alternatively, the navigation method based on natural speech recognition provided by the embodiment of the present invention may also comprise: the cloud computing platform server obtains the navigation data statistics; the cloud computing platform server selects the preferred target POI information from the two or more pieces of target POI information according to the navigation data statistics.
Optionally, to further increase the speed at which the cloud computing platform server performs speech recognition, as shown in Figure 6, the method may also comprise, after step 305 and before step 306:
Step 309, the cloud computing platform server searches a spoken-word dictionary according to the words contained in the text information and, according to the search result, deletes spoken words from the words contained in the text information, wherein the spoken-word dictionary stores spoken words, and the spoken words do not include any text that carries substantive meaning in the voice information input by the user.
For the specific implementation of the navigation method based on natural speech recognition provided by the embodiment of the present invention, reference may be made to the description of the navigation system based on natural speech recognition provided by the embodiment of the present invention; details are not repeated here.
In the navigation method based on natural speech recognition provided by the embodiments of the present invention, after the user presses the start button of the one-button control device mounted on the steering wheel of the vehicle, the terminal device establishes a voice session connection with the cloud computing platform server, and the system enters the automatic voice navigation state. When the user sends navigation location voice information to the cloud computing platform server through the terminal device, the cloud computing platform server first uses speaker-independent speech recognition technology to recognize and parse the navigation location voice information and obtain the corresponding text information, then performs information matching using the words contained in the text information, and takes the POI information in the POI database with the highest degree of match to those words as the target POI information obtained by recognizing the navigation location voice information. The cloud computing platform server can obtain the target POI information without an exact match to the navigation location voice information sent by the user, which improves the success rate of Chinese speech recognition and, in turn, the reliability of the voice navigation service and the user's experience of it. This solves the problem in the prior art that speech recognition is performed by exact matching of the voice information, so that recognition fails whenever the wording differs, the recognition success rate is low, the voice navigation service is unreliable and the user's experience of it is poor. Because the cloud computing platform server in the technical solution provided by the embodiments of the present invention performs speech recognition by word matching, it only needs to store target words in the dictionary and standard POI information in the POI database, and does not need to store a large number of differently worded text entries for the same thing. The dictionary and the POI database are therefore smaller and easier to search, which increases speech recognition speed. This solves the problem in the prior art that the vocabulary must store a large number of differently worded text entries for the same thing, which makes the vocabulary large and hard to search, slows speech recognition and causes high latency in the voice navigation service system. The natural speech recognition technology used by the cloud computing platform server in the technical solution provided by the embodiments of the present invention differs from English speech recognition technology: it addresses the characteristics of Chinese, namely a large vocabulary and sentences in which words run together without pauses, by segmenting each sentence into words and searching word by word, and it achieves a higher success rate and recognition speed for Chinese speech.
The navigation method and system based on natural speech recognition provided by the embodiments of the present invention can be applied in the field of navigation.
The above is only a specific implementation of the present invention, but the protection scope of the present invention is not limited to it. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.