CN107578773A - A method for rapid language identification - Google Patents

A method for rapid language identification

Info

Publication number
CN107578773A
Authority
CN
China
Prior art keywords
languages
phonetic notation
feature words
analyzed
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710664324.9A
Other languages
Chinese (zh)
Inventor
梁镇爽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Global Tone Communication Technology Qingdao Co Ltd
Original Assignee
Global Tone Communication Technology Qingdao Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Global Tone Communication Technology Qingdao Co Ltd
Priority to CN201710664324.9A
Publication of CN107578773A

Landscapes

  • Machine Translation (AREA)

Abstract

The present invention provides a method for rapid language identification, comprising: collecting voice information and capturing a feature word; analyzing the feature word to obtain a pitch contour and a formant vector; analyzing the pitch contour and formant vector to obtain the phonetic notation of the feature characters that make up the feature word; plotting the phonetic notation of the feature characters in the feature word as a phonetic notation timing diagram, obtaining the ordered phonetic notation sequence, and constructing a feature word signal from that sequence; encoding the feature word signal to form a sample value; and comparing the sample value against a model library to obtain the corresponding language identification code. Because the invention extracts features by analyzing phonetic notation, it needs no complex algorithms and avoids large systems and heavy computation; it therefore does not require high-capacity equipment, while the validity and accuracy of identification can be ensured.

Description

A method for rapid language identification
Technical field
The present invention relates to language identification technology, and in particular to speech-based language identification technology.
Background
Speech recognition technology is being applied ever more widely in people's daily lives. From a technical perspective, for example, Apple's speech recognition technology is the global leader. From an application perspective, speech recognition is already used in practical applications such as voice-controlled switches, map navigation, payment, and voice typing. However, whether it is Apple's speech recognition or Baidu's, recognition depends on the language version of the system: if the system is a Chinese system, speech recognition can recognize only Chinese; if the system is an English system, it can recognize only English. Not counting the dialects of ethnic minorities, there are more than 2,790 languages worldwide. How to use speech recognition technology on a single system platform to determine the language quickly, and to respond in the corresponding language, has therefore become a technical problem that the field needs to solve.
Beyond tourism and translation, the problem of unrecognized languages also arises in daily life. In information services, for example, many information queries can be served in several languages, but the user must first be prompted, in several languages, to select a language. A language identification system must determine the user's language in advance in order to provide service in the right language. Typical examples of such services include travel information, emergency services, shopping, banking, and stock trading. Another example is Bluetooth connection by voice, a function in which the user says "pair Bluetooth" or "Bluetooth pairing" to connect a certain brand of mobile phone to a Bluetooth device: if the phone is a Chinese-language phone and the Bluetooth device is an English-language one, the connection may fail. Other products in frequent daily use, such as in-car navigation, may run into similar speech recognition problems.
The field has studied this problem. In the United States, for example, in order to better help foreigners obtain assistance, GMM and HMM algorithms are used to encode entire languages, producing a codebook for each language, such as an English codebook; a language can be identified only with its complete codebook. This technique, however, relies on complex algorithms whose cost is too high for use in enterprises and products; the codebooks require collecting material for nearly every language, an effort that only a government could undertake; and for ordinary users, the codebooks for all languages are too large to be used on small devices.
Summary of the invention
The technical problem to be solved by the invention is to provide a method for rapid language identification that lets a single device and a single system directly recognize multiple languages, and that is widely applicable.
The technical solution adopted by the invention is a method for rapid language identification, comprising the following steps:
(1) Collect voice information and capture a feature word;
(2) Analyze the feature word to obtain a pitch contour and a formant vector;
(3) Analyze the pitch contour and formant vector to obtain the phonetic notation of the feature characters that make up the feature word;
(4) Plot the phonetic notation of the feature characters in the feature word as a phonetic notation timing diagram, obtain the ordered phonetic notation sequence, and construct a feature word signal from that sequence;
(5) Encode the feature word signal to form a sample value;
(6) Compare the sample value against the model library to obtain the corresponding language identification code;
(7) Feed the language identification code back to the designated system.
Further, the phonetic notation of a feature character is any combination of two or more of the 21 initials and 16 finals.
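As an illustration of this constraint, the following Python sketch checks whether a candidate notation is a combination of two or more symbols from a 37-symbol inventory. It assumes the inventory is the Zhuyin (Bopomofo) alphabet, whose 21 initials and 16 finals (counting the three medials as finals) match the counts given here; the source does not name the symbol set, and `is_valid_notation` is a hypothetical helper, not part of the patent.

```python
# Assumption: the 37 symbols are the Zhuyin (Bopomofo) letters.
# The patent text only gives the counts (21 initials, 16 finals).
INITIALS = "ㄅㄆㄇㄈㄉㄊㄋㄌㄍㄎㄏㄐㄑㄒㄓㄔㄕㄖㄗㄘㄙ"  # 21 initials
FINALS = "ㄧㄨㄩㄚㄛㄜㄝㄞㄟㄠㄡㄢㄣㄤㄥㄦ"  # 16 finals (medials counted as finals)
INVENTORY = set(INITIALS) | set(FINALS)


def is_valid_notation(notation: str) -> bool:
    """Return True if the notation combines two or more inventory symbols."""
    return len(notation) >= 2 and all(symbol in INVENTORY for symbol in notation)
```

Under this assumption, a syllable such as ㄋㄧ passes the check, while a single symbol or a string containing characters outside the inventory does not. Tone marks are left out of the sketch because the source does not mention them.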
Further, the invention also includes a step of building the model library, which comprises:
(1) Assign a unique language identification code to a given language;
(2) Collect voice information in that language and capture a feature word;
(3) Analyze the feature word to obtain a pitch contour and a formant vector;
(4) Analyze the pitch contour and formant vector to obtain the phonetic notation of the feature characters;
(5) Plot the phonetic notation of the feature characters as a phonetic notation timing diagram, forming the feature word map for that language;
(6) Arrange the phonetic notation in order, and construct a feature word signal from the ordered sequence;
(7) Encode the feature word signal to form the model value for that language;
(8) Package the language identification code, the feature word map, and the model value, and store them as one model library label;
(9) Repeat steps (1)-(8) for other languages until the model library is built.
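The library-building steps above can be sketched as a small data structure. This is a minimal illustration under stated assumptions, not the patent's implementation: `ModelEntry`, `build_model_library`, and the `encode` callback are hypothetical names, the acoustic analysis of steps (2)-(4) is assumed to have already produced an ordered phonetic notation sequence per language, and the ordered sequence itself stands in for the feature word map.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    """One model library label: the three items packaged in step (8)."""
    language_code: int       # unique identification code from step (1)
    feature_word_map: tuple  # stand-in for the feature word map of step (5)
    model_value: str         # encoded feature word signal from step (7)


def build_model_library(notations_by_language, encode):
    """Build a library keyed by model value.

    notations_by_language maps a language name to the ordered phonetic
    notation sequence of its feature word; encode is a caller-supplied
    (hypothetical) function that turns a sequence into a model value.
    """
    library = {}
    # Step (9): repeat for every language, assigning codes 1, 2, 3, ...
    for code, language in enumerate(sorted(notations_by_language), start=1):
        sequence = tuple(notations_by_language[language])
        entry = ModelEntry(code, sequence, encode(sequence))
        library[entry.model_value] = entry
    return library
```

Keying the dictionary by model value makes the later comparison step a direct lookup. If two languages produced the same model value the later entry would overwrite the earlier one, so a real system would need a rule for that case; the source does not describe one.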
Further, in step (6), when the sample value matches none of the model values in the model library, the language of that speech is marked as a new language, and the new language is written into the model library using the method of claim 3.
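A hedged sketch of this fallback follows. It assumes that "matching" means exact equality of the encoded values, which the source does not state, and it reduces the model library to a plain mapping from model value to language identification code; `identify_or_enroll` is a hypothetical name.

```python
def identify_or_enroll(sample_value, library):
    """Return (language_code, is_new) for a sample value.

    library maps model value -> language identification code.
    Assumption: a match is exact equality of encoded values.
    """
    if sample_value in library:
        return library[sample_value], False
    # No model value matches: mark the language as new and write it
    # into the library under the next unused identification code.
    new_code = max(library.values(), default=0) + 1
    library[sample_value] = new_code
    return new_code, True
```

Once a new language has been enrolled, the same sample value is recognized on the next call instead of being enrolled again.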
The beneficial effects of the invention are as follows. First, the invention extracts features by analyzing phonetic notation; it needs no complex algorithms and avoids large systems and heavy computation, so it does not require high-capacity equipment. The method can therefore be applied to many small and medium-sized portable devices, which broadens its scope of application. In addition, the invention encodes the phonetic notation in binary and needs no extensive, repeated training stage, which reduces system development cost, avoids the software differences caused by multiple language versions, and allows a globally unified standard. The identification method can recognize multiple languages and dialects, and can ensure the validity and accuracy of identification.
Brief description of the drawings
Fig. 1 is a schematic diagram of the present invention.
Fig. 2 is a usage diagram of the present invention.
Embodiment
Refer to Fig. 1.
The speech recognition method of the present invention is based on the following principles:
(1) Voice is collected with the feature word, which is composed of feature characters, as the minimum unit of capture;
(2) The feature word is analyzed to obtain a pitch contour and a formant vector;
(3) The pitch contour and formant vector are analyzed to obtain the phonetic notation of the feature characters (the 37 phonetic symbols comprise 21 initials and 16 finals);
(4) From the phonetic notation timing diagram of the feature characters in the feature word, the ordered phonetic notation sequence is obtained, and a feature word signal is constructed from that sequence;
(5) Using binary coding, the feature word signal is encoded into binary information, forming a sample value;
(6) The sample value is bound to a language identification code, completing identification of the language.
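The source does not say how the pitch contour in principle (2) is computed. As one common possibility, the sketch below estimates the pitch of each frame by picking the peak of the frame's autocorrelation; this is a swapped-in textbook technique, and the function names, frame sizes, and the 60-400 Hz search range are all assumptions made for illustration.

```python
import math


def estimate_pitch(frame, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) of one frame.

    Picks the lag with the largest autocorrelation inside the assumed
    pitch range; returns 0.0 when no positive peak exists (e.g. silence).
    """
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0


def pitch_contour(signal, sample_rate, frame_len=400, hop=200):
    """Slide a window over the signal and collect one pitch value per frame."""
    return [
        estimate_pitch(signal[start:start + frame_len], sample_rate)
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


# Usage: a synthetic 200 Hz tone sampled at 8 kHz stands in for recorded speech.
tone = [math.sin(2 * math.pi * 200 * n / 8000) for n in range(1600)]
contour = pitch_contour(tone, 8000)
```

Real speech would also need voiced/unvoiced detection and smoothing, and the formant vector would require a separate analysis; neither is shown here.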
Examples of feature words: hello, HELLO, こんにちは and the like, which together form the feature word sample library.
Note on feature words: in essentially every language, a feature word is an important opening term of an utterance, or a vocabulary item specific to that language, for example "report", "black and white", or "earth".
For the user, the method is applied as follows:
(1) The user says the feature word "hello" to a device on which the present invention is installed;
(2) The spoken feature word is captured, and its sound wave is analyzed to establish a pitch contour and a formant vector;
(3) The pitch contour and formant vector are analyzed to obtain the phonetic notation of the feature characters (the 37 phonetic symbols comprise 21 initials and 16 finals);
(4) From the phonetic notation timing diagram of the feature characters in the feature word, the ordered phonetic notation sequence is obtained, and a feature word signal is constructed from that sequence;
(5) Using binary coding, the feature word signal is encoded into binary information, forming a sample value;
(6) The sample value is compared against the model values in the model library to obtain the language identification code;
(7) The language identification code is sent to the designated system on the device, which delivers information to the user in that language.
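Steps (5) and (6) of this flow can be sketched as follows. With 37 symbols, 6 bits per symbol is the smallest fixed width that fits (2^6 = 64), so the sketch uses a 6-bit code per symbol. The Zhuyin symbol set, the fixed-width scheme, the exact-match lookup, and the names `encode_sequence` and `identify` are all assumptions; the source states only that the feature word signal is encoded into binary and compared with the model library.

```python
# Assumption: the 37 symbols are the Zhuyin letters (21 initials, 16 finals).
SYMBOLS = "ㄅㄆㄇㄈㄉㄊㄋㄌㄍㄎㄏㄐㄑㄒㄓㄔㄕㄖㄗㄘㄙ" + "ㄧㄨㄩㄚㄛㄜㄝㄞㄟㄠㄡㄢㄣㄤㄥㄦ"
SYMBOL_INDEX = {symbol: index for index, symbol in enumerate(SYMBOLS)}
BITS_PER_SYMBOL = 6  # 2**6 = 64 >= 37 symbols


def encode_sequence(sequence):
    """Encode an ordered phonetic notation sequence as a bit string."""
    return "".join(
        format(SYMBOL_INDEX[symbol], f"0{BITS_PER_SYMBOL}b") for symbol in sequence
    )


def identify(sequence, library):
    """Return the language code whose model value equals the sample value.

    library maps model value (bit string) -> language identification code.
    Returns None when nothing matches.
    """
    return library.get(encode_sequence(sequence))


# Usage: the code 1 for the enrolled word is a made-up identification code.
library = {encode_sequence("ㄋㄧㄏㄠ"): 1}
language_code = identify("ㄋㄧㄏㄠ", library)
```

A fixed-width code keeps every symbol boundary recoverable from the bit string, which is convenient for debugging; a real system might prefer a tolerance-based comparison rather than exact equality.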

Claims (4)

1. A method for rapid language identification, characterized by comprising the following steps:
(1) Collect voice information and capture a feature word;
(2) Analyze the feature word to obtain a pitch contour and a formant vector;
(3) Analyze the pitch contour and formant vector to obtain the phonetic notation of the feature characters that make up the feature word;
(4) Plot the phonetic notation of the feature characters in the feature word as a phonetic notation timing diagram, obtain the ordered phonetic notation sequence, and construct a feature word signal from that sequence;
(5) Encode the feature word signal to form a sample value;
(6) Compare the sample value against the model library to obtain the corresponding language identification code;
(7) Feed the language identification code back to the designated system.
2. The method for rapid language identification according to claim 1, characterized in that the phonetic notation of a feature character is any combination of two or more of the 21 initials and 16 finals.
3. The method for rapid language identification according to claim 1, characterized in that the invention also includes a step of building the model library, which comprises:
(1) Assign a unique language identification code to a given language;
(2) Collect voice information in that language and capture a feature word;
(3) Analyze the feature word to obtain a pitch contour and a formant vector;
(4) Analyze the pitch contour and formant vector to obtain the phonetic notation of the feature characters;
(5) Plot the phonetic notation of the feature characters as a phonetic notation timing diagram, forming the feature word map for that language;
(6) Arrange the phonetic notation in order, and construct a feature word signal from the ordered sequence;
(7) Encode the feature word signal to form the model value for that language;
(8) Package the language identification code, the feature word map, and the model value, and store them as one model library label;
(9) Repeat steps (1)-(8) for other languages until the model library is built.
4. The method for rapid language identification according to claim 1, characterized in that, in step (6), when the sample value matches none of the model values in the model library, the language of that speech is marked as a new language, and the new language is written into the model library using the method of claim 3.
CN201710664324.9A 2017-08-07 2017-08-07 A method for rapid language identification Pending CN107578773A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710664324.9A CN107578773A (en) 2017-08-07 2017-08-07 A method for rapid language identification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710664324.9A CN107578773A (en) 2017-08-07 2017-08-07 A method for rapid language identification

Publications (1)

Publication Number Publication Date
CN107578773A (en) 2018-01-12

Family

ID=61034358

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710664324.9A Pending CN107578773A (en) 2017-08-07 2017-08-07 A method for rapid language identification

Country Status (1)

Country Link
CN (1) CN107578773A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111487898A (en) * 2019-01-28 2020-08-04 智同科技股份有限公司 Voice control electrical equipment with language type discrimination

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180112
