CN106205613B - Navigation speech recognition method and system - Google Patents

Navigation speech recognition method and system

Info

Publication number
CN106205613B
Authority
CN
China
Prior art keywords
result
similarity
recognition
client
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610587485.8A
Other languages
Chinese (zh)
Other versions
CN106205613A (en)
Inventor
梁国锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Maitu Information Technology Co ltd
Original Assignee
Guangzhou Maitu Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Maitu Information Technology Co Ltd
Priority to CN201610587485.8A
Publication of CN106205613A
Application granted
Publication of CN106205613B
Current legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60: Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/50: Network services
    • H04L67/52: Network services specially adapted for the location of the user terminal

Abstract

The present invention provides a navigation speech recognition method and system. The method comprises the following steps: voice information received by the client is recognized by a speech engine; the background database is searched for the recognition result according to the recognition region, and if a corresponding data record exists, the recognition result is returned to the client; if no corresponding data record exists, the local database is searched for data that lies within a preset territorial scope and whose similarity to the recognition result is greater than a preset value, and if such results exist, they are recommended to the user in descending order of frequency of use; if the local database holds no result whose similarity is greater than the preset value, the recognition result is sent to the background server database, all records are sorted in descending order of similarity, and the sorted results are returned to the client. The technical solution can recommend results closer to the user's input based on the recognition result, reducing search time.

Description

Navigation speech recognition method and system
Technical field
The invention belongs to the technical field of speech recognition, and in particular relates to a navigation speech recognition method and system.
Background art
Speech recognition has been one of the ten most important technological developments in information technology over the past decade. It is mainly applied in voice dialing, voice navigation, indoor device control, voice search, dictation data entry and the like. A speech recognition system generally comprises an acoustic model and a language model. The acoustic model converts speech into phonemes, such as the phonetic symbols of English or the initials and finals of Chinese pinyin; the language model converts phonemes into text. Together the two complete the recognition of speech into text.
The prior art generally offers three kinds of speech recognition: the first is based on a cloud recognition engine, the second on a local speech recognition engine, and the third on a local engine and a cloud engine used together. Whichever of these is used, because the recognition process is automated, homophones written with different characters can occur, and so multiple recognition results can arise. When there are multiple recognition results, ranking them becomes particularly important. The prior art usually returns the recognition results to the user sorted by similarity from high to low. In addition, existing speech recognition engines have a low recognition rate in ordinary environments, and most recognition results differ considerably from the expected result. Moreover, when results are returned sorted only by similarity, users with non-standard pronunciation must spend a lot of time on every query looking for the result they need.
Summary of the invention
In view of the above technical problems, the invention discloses a navigation speech recognition method and system that analyzes a specific recognition region together with the speech engine's recognition result to obtain results closer to the input speech.
To this end, the technical solution adopted by the invention is as follows:
A navigation speech recognition method, comprising the following steps:
Step S1: voice information received by the client is recognized by a speech engine;
Step S2: the background database is searched for the result of the speech recognition of step S1 according to the recognition region; if a corresponding data record exists, the recognition result is returned to the client and recognition is complete;
If no corresponding data record exists, the local database is searched for data that lies within the preset territorial scope and whose similarity to the recognition result is greater than the preset value. If such data exists, it is recommended to the user sorted in descending order of frequency of use. If the local database holds no data whose similarity is greater than the preset value, the recognition result is sent to the background server database; the background server calculates the similarity between every record in its database and the recognition result, sorts the records in descending order of similarity, and returns the sorted results to the client. Here, "similar data" means data whose similarity is greater than the preset similarity;
If the background server database also holds no data whose similarity is greater than the preset similarity, this speech recognition is treated as an error, and the error is fed back to the client.
Here, the local database refers to the storage database carried by the device itself.
With this technical solution, speech is recognized through several paths and, in combination with the specific recognition region, results closer to the input speech are obtained, reducing search time.
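The tiers of step S2 can be sketched as follows. This is a minimal illustration rather than the patented implementation: the `Record` type, the `recognize` function and its status strings are hypothetical names, both databases are modeled as plain lists, and `difflib.SequenceMatcher` stands in for the similarity measure, which the patent does not specify.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Record:
    name: str
    region: str
    frequency: int = 0  # how often the user has picked this entry


def similarity(a: str, b: str) -> float:
    # Assumption: the patent names no metric; difflib's ratio is a stand-in.
    return SequenceMatcher(None, a, b).ratio()


def recognize(text, region, server_db, local_db, threshold=0.5):
    """Return (status, candidates) following the tiers of step S2."""
    # Tier 1: an identical record inside the recognition region.
    if any(r.name == text and r.region == region for r in server_db):
        return "exact", [text]
    # Tier 2: local records in the preset region above the threshold,
    # recommended in descending order of frequency of use.
    local = [r for r in local_db
             if r.region == region and similarity(r.name, text) > threshold]
    if local:
        local.sort(key=lambda r: r.frequency, reverse=True)
        return "local", [r.name for r in local]
    # Tier 3: score every server record and sort by descending similarity.
    scored = [(similarity(r.name, text), r.name) for r in server_db]
    hits = sorted((s for s in scored if s[0] > threshold), reverse=True)
    if hits:
        return "server", [name for _, name in hits]
    # Nothing above the threshold anywhere: report a recognition error.
    return "error", []
```

Returning a status alongside the candidates lets the client distinguish a confirmed match from a list of recommendations or a failure.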
As a further improvement of the invention, in step S2 the result of the speech recognition of step S1 is first checked for validity, and the search is performed only if the result is valid. With this technical solution, the recognition result first undergoes a preliminary analysis; if it is not a valid result, the subsequent search is skipped, so that feedback is faster and no time is wasted unnecessarily.
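The patent does not say what makes a recognition result "valid". As a minimal sketch, assuming a result counts as valid when it contains at least one letter or digit after trimming, the pre-check might look like this (`is_valid_result` and `search_if_valid` are hypothetical names):

```python
def is_valid_result(text: str) -> bool:
    """Assumed validity rule: some alphanumeric content survives trimming."""
    stripped = text.strip()
    # str.isalnum() is also True for CJK characters, so Chinese place names pass.
    return any(ch.isalnum() for ch in stripped)


def search_if_valid(text: str, search):
    """Run the caller-supplied search only when the result is valid."""
    if not is_valid_result(text):
        return None  # skip retrieval so feedback can be sent immediately
    return search(text.strip())
```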
As a further improvement of the invention, in step S2, if there are results whose similarity is greater than the preset value, they are recommended to the user sorted in descending order of frequency of use; the user's current location information is also obtained, and the data is filtered by territorial scope before being fed back to the client. With this technical solution, the analysis takes the user's current location into account, making the recognition result more accurate.
As a further improvement of the invention, in step S2, if the local database holds no similar result, the recognition result is sent to the background server database; the server calculates the similarity between every record in the background server database and the recognition result, sorts the records in descending order of similarity, and returns the sorted results to the client; the user's current location information is also obtained, and the data is filtered by territorial scope before being fed back to the client.
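The location-based screening in these two improvements can be sketched as a filter applied to an already ranked list, so the same step serves both the frequency-ordered local results and the similarity-ordered server results. The `Candidate` type and the use of a city name as the territorial scope are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    city: str  # assumed granularity of the territorial scope


def filter_by_scope(ranked, user_city):
    """Drop candidates outside the user's city, preserving the existing order."""
    return [c for c in ranked if c.city == user_city]
```

Because a list comprehension preserves input order, the ranking computed earlier survives the filtering unchanged.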
As a further improvement of the invention, in step S2, if the background server database also holds no data whose similarity is greater than the preset similarity, this speech recognition is treated as an error and fed back to the client, and the user is told to re-enter the voice information or to add a qualifier or expansion word.
As a further improvement of the invention, in step S2, the preset similarity is not less than 50%.
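The patent does not define how similarity is computed. One common choice, assumed here purely for illustration, is a normalized edit distance: 1 minus the Levenshtein distance divided by the length of the longer string. The function names below are hypothetical; `passes_threshold` also enforces the stated constraint that the preset value is not less than 50%.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def similarity(a: str, b: str) -> float:
    """Assumed metric: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - edit_distance(a, b) / longest


def passes_threshold(a: str, b: str, preset: float = 0.5) -> bool:
    if preset < 0.5:
        raise ValueError("preset similarity must not be less than 50%")
    # The text requires similarity strictly greater than the preset value.
    return similarity(a, b) > preset
```

With the default preset, a pair that is exactly 50% similar does not pass, since the comparison is strict.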
The invention also discloses a speech recognition system comprising a voice receiving module, a speech recognition module, a local database, a communication module, a background server and a sending module. The voice receiving module receives the voice information sent by the client. The speech recognition module searches the background server database for the corresponding data record according to the recognition region, or searches the local database for data within the preset territorial scope that is similar to the recognition result. The communication module connects to the background server and sends the recognition result to the background server database. The background server calculates the similarity between every record in the background server database and the recognition result and sorts the records in descending order of similarity. The sending module returns the sorted results to the client.
Compared with the prior art, the beneficial effects of the invention are as follows:
With the technical solution of the invention, analyzing the specific recognition region together with the speech engine's recognition result yields results closer to the input speech, reduces search time, and makes the system more convenient to use. The technical solution can be applied on any platform that needs speech recognition, especially intelligent terminals.
Description of the drawings
Fig. 1 is a flow chart of an embodiment of the invention.
Detailed description of the embodiments
Preferred embodiments of the invention are described in further detail below with reference to the accompanying drawing.
A navigation speech recognition method is shown in Fig. 1. A third-party speech engine first produces a recognition result. The background server database is then searched, according to the recognition region the user selected for the voice content, for a corresponding record, that is, an identical record. If one exists, the recognition result obtained by the speech engine is sent to the client; in this case there is exactly one recognition result, corresponding to one entry in the background server's search database. If nothing in the database is identical to the recognition result, the result is considered to contain some error; the recognition result is sent to the client together with a report that it may be erroneous, and the method proceeds to the next step. The local database is then searched for data within the preset territorial scope that is similar to the recognition result, and this data is recommended to the user in descending order of frequency of use, reducing the user's search time. Here, "similar data" means data whose similarity is greater than the preset similarity. In addition, if no data with similarity greater than the preset similarity is found in the local database, the recognition result is sent to the background server for retrieval; the background server calculates the similarity between every record in the database and the recognition result, sorts the records in descending order of similarity, and returns the sorted results to the client. If the background server database likewise holds no data whose similarity is greater than the preset similarity, the speech recognition is considered to have failed; this is fed back to the client, and the user is reminded to re-enter the voice information or to add a qualifier or expansion word.
For example, a user located in Chengdu wants to navigate to "Guang Zhouta" (Guangzhou Tower) in Guangzhou and speaks "Guang Zhouta". The preset territorial scope is the city's own districts, so only place names in Chengdu are searched, and the backend finds no data corresponding to "Guang Zhouta". The next step searches the local database. Suppose the local database contains "optical axis tower", "Guang Zhouta", "stroll week it" and "Guang Zhouta", with frequencies of use A, B, C and D respectively, where D > C > B > A. The information presented to the user is then, from top to bottom, "Guang Zhouta", "stroll week it", "Guang Zhouta", "optical axis tower", which makes it easy for the user to choose, and recognition succeeds.
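The ordering in this example can be reproduced with a short sketch. The numeric frequencies are made-up values chosen only so that D > C > B > A holds; the labels are copied from the example as given.

```python
# (label, frequency) pairs; the numbers are illustrative stand-ins for A..D.
local_matches = [
    ("optical axis tower", 1),  # A
    ("Guang Zhouta", 2),        # B
    ("stroll week it", 3),      # C
    ("Guang Zhouta", 4),        # D
]


def rank_by_frequency(matches):
    """Return labels ordered from most to least frequently used."""
    ordered = sorted(matches, key=lambda m: m[1], reverse=True)
    return [label for label, _ in ordered]


presented = rank_by_frequency(local_matches)
```

Here `presented` lists the entries in the order D, C, B, A, matching the top-to-bottom order described in the example.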
If the local database requires a similarity greater than 50% and no data with more than 50% similarity to the recognition result "Guang Zhouta" can be found there, the recognition result "Guang Zhouta" is sent to the background server database for retrieval. The background server calculates the similarity between every record in the database and the recognition result, sorts the data whose similarity exceeds the preset threshold in descending order of similarity, and returns the sorted results to the client.
If the server calculates the similarity between every record in the database and the recognition result and every similarity is below the preset threshold, recognition fails. The user is then told that this voice input was in error and is prompted to change the voice input or to add a qualifier or expansion word. For example, a user in Chengdu who needs to navigate to "Guang Zhouta" in Guangzhou adds an expansion word when speaking, so that the voice input is "Guangzhou Guang Zhouta".
The above is a further detailed description of the invention in connection with specific preferred embodiments, and the specific implementation of the invention should not be regarded as limited to these descriptions. A person of ordinary skill in the art to which the invention belongs may make a number of simple deductions or substitutions without departing from the inventive concept, and all of these shall be regarded as falling within the protection scope of the invention.

Claims (7)

1. A navigation speech recognition method, characterized in that it comprises the following steps:
Step S1: voice information received by the client is recognized by a speech engine;
Step S2: a recognition region is obtained for the result of the speech recognition of step S1 and the background server database is searched; if a corresponding data record exists, the recognition result is returned to the client and recognition is complete;
If no corresponding data record exists, the local database is searched for data that lies within the preset territorial scope and whose similarity to the recognition result is greater than the preset value; if data whose similarity is greater than the preset value exists, that data is recommended to the user sorted in descending order of frequency of use; if the local database holds no data whose similarity is greater than the preset value, the recognition result is sent to the background server database, the background server calculates the similarity between every record in the background server database and the recognition result, sorts the records in descending order of similarity, and returns the sorted results to the client;
If the background server database also holds no data greater than the preset value, this speech recognition is treated as an error and fed back to the client.
2. The navigation speech recognition method according to claim 1, characterized in that: in step S2, the result of the speech recognition of step S1 is first checked, and the search is performed only if the result is valid.
3. The navigation speech recognition method according to claim 1, characterized in that: in step S2, if there are results whose similarity is greater than the preset value, they are recommended to the user sorted in descending order of frequency of use, the user's current location information is obtained, and the data is filtered by territorial scope and fed back to the client.
4. The navigation speech recognition method according to claim 1, characterized in that: in step S2, if the local database holds no similar result, the recognition result is sent to the background server database, the server calculates the similarity between every record in the background server database and the recognition result, sorts the records in descending order of similarity, and returns the sorted results to the client; the user's current location information is obtained, and the data is filtered by territorial scope and fed back to the client.
5. The navigation speech recognition method according to any one of claims 1 to 4, characterized in that: in step S2, if the background server database also holds no data greater than the preset value, this speech recognition is treated as an error and fed back to the client, and the user is told to re-enter the voice information or to add a qualifier or expansion word.
6. The navigation speech recognition method according to claim 5, characterized in that: in step S2, the preset value is not less than 50%.
7. A speech recognition system, characterized in that: the speech recognition system comprises a voice receiving module, a speech recognition module, a local database, a communication module, a background server and a sending module;
Wherein the voice receiving module is used for receiving the voice information sent by the client;
The speech recognition module is used for performing speech recognition on the voice information to obtain a recognition region, and for searching the background server database for the corresponding data record according to the recognition region, or searching the local database for data that lies within the preset territorial scope and whose similarity to the recognition result is greater than the preset value; the communication module is used for connecting to the background server and sending the recognition result to the background server database;
The background server is used for calculating the similarity between every record in the background server database and the recognition result and sorting the records in descending order of similarity;
The sending module is used for returning the sorted results to the client.
CN201610587485.8A 2016-07-22 2016-07-22 A kind of navigation audio recognition method and system Active CN106205613B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610587485.8A CN106205613B (en) 2016-07-22 2016-07-22 A kind of navigation audio recognition method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610587485.8A CN106205613B (en) 2016-07-22 2016-07-22 A kind of navigation audio recognition method and system

Publications (2)

Publication Number Publication Date
CN106205613A CN106205613A (en) 2016-12-07
CN106205613B true CN106205613B (en) 2019-09-06

Family

ID=57491795

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610587485.8A Active CN106205613B (en) 2016-07-22 2016-07-22 A kind of navigation audio recognition method and system

Country Status (1)

Country Link
CN (1) CN106205613B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109101475B (en) * 2017-06-20 2021-07-27 北京嘀嘀无限科技发展有限公司 Travel voice recognition method and system and computer equipment
TW201921336A (en) 2017-06-15 2019-06-01 大陸商北京嘀嘀無限科技發展有限公司 Systems and methods for speech recognition
CN107993654A (en) * 2017-11-24 2018-05-04 珠海格力电器股份有限公司 A kind of voice instruction recognition method and system
CN108804070B (en) * 2018-05-30 2021-01-26 Oppo广东移动通信有限公司 Music playing method and device, storage medium and electronic equipment
CN111276147A (en) * 2019-12-30 2020-06-12 天津大学 Diet recording method based on voice input
CN114333828A (en) * 2022-03-08 2022-04-12 深圳市华方信息产业有限公司 Quick voice recognition system for digital product

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002123290A (en) * 2000-10-16 2002-04-26 Pioneer Electronic Corp Speech recognition device and speech recognition method
CN101290768A (en) * 2008-06-20 2008-10-22 清华大学 Voice inquiry method of great Chinese vocabulary based on embedded system environment
CN102968987A (en) * 2012-11-19 2013-03-13 百度在线网络技术(北京)有限公司 Speech recognition method and system
WO2015133142A1 (en) * 2014-03-06 2015-09-11 株式会社デンソー Reporting apparatus
CN105632499A (en) * 2014-10-31 2016-06-01 株式会社东芝 Method and device for optimizing voice recognition result
CN105279227A (en) * 2015-09-11 2016-01-27 百度在线网络技术(北京)有限公司 Voice search processing method and device of homonym
CN105389400A (en) * 2015-12-24 2016-03-09 Tcl集团股份有限公司 Speech interaction method and device

Also Published As

Publication number Publication date
CN106205613A (en) 2016-12-07

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20190730

Address after: Room B512, Room 510000, Fifth Floor, 173 Jiangnan Avenue Middle Road, Haizhu District, Guangzhou City, Guangdong Province

Applicant after: Guangzhou Maitu Information Technology Co.,Ltd.

Address before: 518000 Pingshan Industrial Park, Taoyuan Street, Nanshan District, Shenzhen City, Guangdong Province

Applicant before: SHENZHEN ZHIMOU TECHNOLOGY CO.,LTD.

GR01 Patent grant
CP02 Change in the address of a patent holder

Address after: Room 1708, No. 180 Jiangnan Avenue Middle, Haizhu District, Guangzhou City, Guangdong Province, 510000 (office only)

Patentee after: Guangzhou Maitu Information Technology Co.,Ltd.

Address before: Room B512, 5th Floor, No. 173 Jiangnan Avenue Middle Road, Haizhu District, Guangzhou City, Guangdong Province, 510000

Patentee before: Guangzhou Maitu Information Technology Co.,Ltd.