CN107578773A - A method for rapid language identification - Google Patents
A method for rapid language identification
- Publication number
- CN107578773A (application CN201710664324.9A)
- Authority
- CN
- China
- Prior art keywords
- languages
- phonetic notation
- feature words
- analyzed
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Machine Translation (AREA)
Abstract
The present invention provides a method for rapid language identification, comprising: collecting speech information and capturing a feature word; analyzing the feature word to obtain a pitch contour and a formant vector; analyzing the pitch contour and the formant vector to obtain the phonetic notation of which each feature character is composed; plotting the phonetic notation of the feature characters in the feature word as a phonetic notation timing diagram, obtaining the phonetic notation sequence, and constructing a feature word signal from that sequence; encoding the feature word signal to form a sample value; and comparing the sample value with a model library to obtain the corresponding language identification code. Because the invention extracts features by analyzing phonetic notation, it needs no complex algorithm and avoids a large system and heavy computation; it therefore needs no large-capacity equipment, while the validity and accuracy of identification can still be ensured.
Description
Technical field
The present invention relates to language identification technology, and in particular to speech-based language identification technology.
Background art
Speech recognition technology is used ever more widely in daily life. From a technical standpoint, for example, Apple's speech recognition technology is a global leader. From an application standpoint, speech recognition is already used in practical applications such as voice-controlled switches, map navigation, payment, and voice typing. However, whether it is Apple's speech recognition or Baidu's, recognition depends on the language version of the system: if the system is a Chinese system, speech recognition can only recognize Chinese; if the system is an English system, it can only recognize English. Even without counting the dialects of ethnic minorities, there are roughly 2,790 languages in the world. How to use speech recognition technology to quickly determine the language on a single system platform, and to respond in the corresponding language, has therefore become a technical problem to be solved in this field.
Beyond tourism and translation, the problem of unrecognized languages also arises in daily life. In information services, for example, many information inquiry services can be offered in several languages, but the user must first be prompted in multiple languages to select a language. A language identification system has to determine the user's language in advance in order to provide service in that language. Typical examples of such services include travel information, emergency services, shopping, banking, and stock trading. Another example is Bluetooth connection by speech recognition, in which the user says "pair Bluetooth" or "Bluetooth pairing" to connect a certain brand of mobile phone to a Bluetooth device: if the phone is a Chinese-language phone and the Bluetooth device is an English-language one, the connection fails. Similar situations can arise in speech recognition for other products frequently used in daily life, such as in-vehicle navigation.
Related research exists in this field. In the United States, for example, to help foreign nationals obtain assistance, Gaussian mixture model (GMM) and hidden Markov model (HMM) algorithms are used to encode entire languages into codebooks, such as an English codebook, and a language can be identified only with the complete codebook. This technique, however, relies on complex algorithms whose cost is too high for use in enterprises and products; the codebooks require collecting data for nearly every language, a workload that only a government can bear; and for ordinary users, a codebook covering all languages is too large to be used on small devices.
Summary of the invention
The technical problem to be solved by the present invention is to provide a method for rapid language identification that lets a single system on a single device directly recognize multiple languages and that is widely applicable.
The technical solution adopted by the present invention to solve this technical problem is a method for rapid language identification, comprising the following steps:
(1) collecting speech information and capturing a feature word;
(2) analyzing the feature word to obtain a pitch contour and a formant vector;
(3) analyzing the pitch contour and the formant vector to obtain the phonetic notation of which each feature character is composed;
(4) plotting the phonetic notation of the feature characters in the feature word as a phonetic notation timing diagram, obtaining the phonetic notation sequence, and constructing a feature word signal from that sequence;
(5) encoding the feature word signal to form a sample value;
(6) comparing the sample value with the model library to obtain the corresponding language identification code;
(7) feeding the language identification code back to a designated system.
Further, the phonetic notation of a feature character is any combination of two or more of the 21 initials and 16 finals.
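The constraint above can be illustrated with a minimal sketch. The source states only the counts (21 initials, 16 finals); the placeholder labels `I00`..`I20` and `F00`..`F15` and the function name `is_valid_notation` are assumptions made for illustration, not part of the application.

```python
# Hypothetical symbol inventory: the source gives only the counts
# (21 initials + 16 finals = 37 symbols); these labels are placeholders.
INITIALS = [f"I{i:02d}" for i in range(21)]
FINALS = [f"F{i:02d}" for i in range(16)]
SYMBOLS = INITIALS + FINALS
SYMBOL_SET = frozenset(SYMBOLS)


def is_valid_notation(notation):
    """Return True if the notation combines two or more known symbols."""
    return len(notation) >= 2 and all(s in SYMBOL_SET for s in notation)
```

Under these assumptions, `is_valid_notation(["I03", "F07"])` is accepted, while a single symbol or an unknown label is rejected.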
Further, the present invention also comprises a step of establishing the model library, which comprises:
(1) assigning a unique language identification code to a given language;
(2) collecting speech information in that language and capturing a feature word;
(3) analyzing the feature word to obtain a pitch contour and a formant vector;
(4) analyzing the pitch contour and the formant vector to obtain the phonetic notation of which each feature character is composed;
(5) plotting the phonetic notation of the feature characters as a phonetic notation timing diagram, forming the feature word map corresponding to that language;
(6) arranging the phonetic notation in sequence, and constructing a feature word signal from that sequence;
(7) encoding the feature word signal to form the model value corresponding to that language;
(8) packaging the language identification code, the feature word map, and the model value, and storing them as one model library entry;
(9) repeating steps (1)-(8) for other languages until the model library is established.
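The packaging in steps (5)-(8) can be sketched as a small data structure. This is an illustration under stated assumptions: the names `ModelEntry` and `ModelLibrary`, the representation of the timing diagram as `(time_index, symbol)` pairs, and the caller-supplied `encode` function are all hypothetical, since the source does not specify storage formats or the encoding itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelEntry:
    language_code: str       # step (1): unique language identification code
    feature_word_map: tuple  # step (5): (time_index, symbol) pairs in time order
    model_value: bytes       # step (7): encoded feature word signal


@dataclass
class ModelLibrary:
    entries: dict = field(default_factory=dict)

    def add(self, language_code, timed_symbols, encode):
        """Package one language as a library entry (step 8)."""
        if language_code in self.entries:
            raise ValueError(f"code already assigned: {language_code}")
        ordered = tuple(sorted(timed_symbols))   # order symbols by time index
        sequence = [symbol for _, symbol in ordered]  # step (6)
        entry = ModelEntry(language_code, ordered, encode(sequence))
        self.entries[language_code] = entry
        return entry


# Usage with a stand-in encoder (an assumption, not the source's encoding).
library = ModelLibrary()
entry = library.add(
    "L001",
    [(1, "F07"), (0, "I03")],
    encode=lambda seq: "-".join(seq).encode("ascii"),
)
```

Calling `add` once per language corresponds to the repetition in step (9); rejecting a reused code keeps each identification code unique, as step (1) requires.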
Further, in step (6), when the sample value matches none of the model values in the model library, the language of the speech is marked as a new language, and the new language is written into the model library using the method of claim 3.
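The comparison and the new-language fallback can be sketched as follows. The source does not say how a sample value is matched against model values; exact equality is assumed here, and the function `identify`, the dictionary layout, and the `NEW-n` code format are hypothetical. Registering a new language from a single value is a simplification of the full library-building procedure.

```python
import itertools

_counter = itertools.count(1)


def new_code():
    """Hypothetical generator of identification codes for new languages."""
    return f"NEW-{next(_counter)}"


def identify(sample_value, model_values, make_code=new_code):
    """Return (language_code, is_new) for a sample value.

    model_values maps each stored model value to its language code.
    Exact-match lookup is an assumption made for this sketch.
    """
    code = model_values.get(sample_value)
    if code is not None:
        return code, False
    # No model value matched: mark as a new language and store it.
    code = make_code()
    model_values[sample_value] = code
    return code, True
```

A second call with the same unmatched value then returns the newly assigned code with `is_new` set to `False`.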
The beneficial effects of the present invention are as follows. First, the invention extracts features by analyzing phonetic notation, so it needs no complex algorithm and avoids a large system and heavy computation; it therefore needs no large-capacity equipment and can be applied to many small and medium-sized portable devices, which broadens its scope of application. The invention encodes the phonetic notation in binary and needs no extensive repeated training stage, which lowers the development cost of the system, avoids the software differences caused by multiple language versions, and allows a single global standard. The identification method of the invention can recognize multiple languages and dialects while ensuring the validity and accuracy of identification.
Brief description of the drawings
Fig. 1 is the schematic diagram of the present invention.
Fig. 2 is a usage diagram of the present invention.
Detailed description
Refer to Fig. 1.
The speech recognition method of the present invention is based on the following principle:
(1) speech is captured using a feature word, made up of feature characters, as the minimum unit;
(2) the feature word is analyzed to obtain a pitch contour and a formant vector;
(3) the pitch contour and the formant vector are analyzed to obtain the phonetic notation of which each feature character is composed (the 37 phonetic symbols comprise 21 initials and 16 finals);
(4) from the phonetic notation timing diagram of the feature characters in the feature word, the phonetic notation sequence is obtained, and a feature word signal is constructed from that sequence;
(5) the feature word signal is encoded in binary to form a sample value;
(6) the sample value is bound to a language identification code, completing the identification of the language.
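Step (5) can be illustrated with a minimal sketch. The source says only that the feature word signal is encoded in binary; the fixed-width scheme below, in which each of the 37 symbols is written as a 6-bit index (since 2^6 = 64 ≥ 37), is an assumption chosen for illustration, as are the function names.

```python
NUM_SYMBOLS = 37       # 21 initials + 16 finals, as stated in the source
BITS_PER_SYMBOL = 6    # assumption: smallest fixed width with 2**6 >= 37


def encode_sequence(indices):
    """Encode symbol indices (0..36) as a bit string, 6 bits per symbol."""
    for i in indices:
        if not 0 <= i < NUM_SYMBOLS:
            raise ValueError(f"symbol index out of range: {i}")
    return "".join(format(i, f"0{BITS_PER_SYMBOL}b") for i in indices)


def decode_sequence(bits):
    """Inverse of encode_sequence: split into 6-bit groups and parse them."""
    if len(bits) % BITS_PER_SYMBOL:
        raise ValueError("bit string length is not a multiple of 6")
    return [
        int(bits[k:k + BITS_PER_SYMBOL], 2)
        for k in range(0, len(bits), BITS_PER_SYMBOL)
    ]
```

A fixed width keeps the encoding reversible without separators, so two different symbol sequences can never yield the same sample value.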
Examples of feature words: the word for "hello" in each language, such as HELLO and こんにちは, which together make up the feature word sample library.
Note on feature words: in essentially every language, a feature word is an important term that commonly opens a conversation, or a specific vocabulary item of the language, for example "report", "black and white", or "earth".
For the user, the method is applied as follows:
(1) the user says the feature word "hello" to a device on which the present invention is installed;
(2) the spoken feature word is captured, and its sound wave is analyzed to establish a pitch contour and a formant vector;
(3) the pitch contour and the formant vector are analyzed to obtain the phonetic notation of which each feature character is composed (the 37 phonetic symbols comprise 21 initials and 16 finals);
(4) from the phonetic notation timing diagram of the feature characters in the feature word, the phonetic notation sequence is obtained, and a feature word signal is constructed from that sequence;
(5) the feature word signal is encoded in binary to form a sample value;
(6) the sample value is compared with the model values in the model library to obtain the language identification code;
(7) the language identification code is sent to the designated system on the device, which pushes information in that language to the user.
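Step (2) above derives a pitch contour from the captured sound wave, but the source does not say how. As one common approach, and purely as an assumption, the sketch below estimates the pitch of each frame from the peak of its autocorrelation; the function names, frame length, and 80-400 Hz search range are illustrative choices, and formant extraction is not covered.

```python
import math


def estimate_pitch(frame, sample_rate, fmin=80.0, fmax=400.0):
    """Estimate the pitch (Hz) of one frame from its autocorrelation peak."""
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag if best_lag else 0.0


def pitch_contour(samples, sample_rate, frame_len=400):
    """Pitch estimate for each consecutive, non-overlapping frame."""
    return [
        estimate_pitch(samples[start:start + frame_len], sample_rate)
        for start in range(0, len(samples) - frame_len + 1, frame_len)
    ]


# Synthetic check: a 200 Hz tone sampled at 8 kHz, two frames long.
RATE = 8000
tone = [math.sin(2 * math.pi * 200 * n / RATE) for n in range(800)]
contour = pitch_contour(tone, RATE)
```

Real speech would need voicing detection and smoothing across frames; the synthetic tone only shows that the per-frame estimate recovers a known frequency.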
Claims (4)
1. A method for rapid language identification, characterized by comprising the following steps:
(1) collecting speech information and capturing a feature word;
(2) analyzing the feature word to obtain a pitch contour and a formant vector;
(3) analyzing the pitch contour and the formant vector to obtain the phonetic notation of which each feature character is composed;
(4) plotting the phonetic notation of the feature characters in the feature word as a phonetic notation timing diagram, obtaining the phonetic notation sequence, and constructing a feature word signal from that sequence;
(5) encoding the feature word signal to form a sample value;
(6) comparing the sample value with the model library to obtain the corresponding language identification code;
(7) feeding the language identification code back to a designated system.
2. The method for rapid language identification according to claim 1, characterized in that the phonetic notation of which a feature character is composed is any combination of two or more of the 21 initials and 16 finals.
3. The method for rapid language identification according to claim 1, characterized in that the method further comprises a step of establishing the model library, which comprises:
(1) assigning a unique language identification code to a given language;
(2) collecting speech information in that language and capturing a feature word;
(3) analyzing the feature word to obtain a pitch contour and a formant vector;
(4) analyzing the pitch contour and the formant vector to obtain the phonetic notation of which each feature character is composed;
(5) plotting the phonetic notation of the feature characters as a phonetic notation timing diagram, forming the feature word map corresponding to that language;
(6) arranging the phonetic notation in sequence, and constructing a feature word signal from that sequence;
(7) encoding the feature word signal to form the model value corresponding to that language;
(8) packaging the language identification code, the feature word map, and the model value, and storing them as one model library entry;
(9) repeating steps (1)-(8) for other languages until the model library is established.
4. The method for rapid language identification according to claim 1, characterized in that, in step (6), when the sample value matches none of the model values in the model library, the language of the speech is marked as a new language, and the new language is written into the model library using the method of claim 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710664324.9A CN107578773A (en) | 2017-08-07 | 2017-08-07 | A method for rapid language identification |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710664324.9A CN107578773A (en) | 2017-08-07 | 2017-08-07 | A method for rapid language identification |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107578773A (en) | 2018-01-12 |
Family
ID=61034358
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710664324.9A Pending CN107578773A (en) | A method for rapid language identification | 2017-08-07 | 2017-08-07 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107578773A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111487898A (en) * | 2019-01-28 | 2020-08-04 | 智同科技股份有限公司 | Voice control electrical equipment with language type discrimination |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9852728B2 (en) | Process for improving pronunciation of proper nouns foreign to a target language text-to-speech system | |
CN110444198B (en) | Retrieval method, retrieval device, computer equipment and storage medium | |
CN104380375B (en) | Device for extracting information from a dialog | |
US8756064B2 (en) | Method and system for creating frugal speech corpus using internet resources and conventional speech corpus | |
US20190164540A1 (en) | Voice recognition system and voice recognition method for analyzing command having multiple intents | |
US7840399B2 (en) | Method, device, and computer program product for multi-lingual speech recognition | |
CN111508479B (en) | Voice recognition method, device, equipment and storage medium | |
US20140244258A1 (en) | Speech recognition method of sentence having multiple instructions | |
CN109192225B (en) | Method and device for recognizing and marking speech emotion | |
CN109785829B (en) | Customer service assisting method and system based on voice control | |
TW201337911A (en) | Electrical device and voice identification method | |
US20180350390A1 (en) | System and method for validating and correcting transcriptions of audio files | |
CN111881297A (en) | Method and device for correcting voice recognition text | |
CN111144102A (en) | Method and device for identifying entity in statement and electronic equipment | |
CN110852075A (en) | Voice transcription method and device for automatically adding punctuation marks and readable storage medium | |
KR20220130739A (en) | speech recognition | |
CN110503956B (en) | Voice recognition method, device, medium and electronic equipment | |
WO2023045186A1 (en) | Intention recognition method and apparatus, and electronic device and storage medium | |
CN111144118A (en) | Method, system, device and medium for identifying named entities in spoken text | |
WO2023272616A1 (en) | Text understanding method and system, terminal device, and storage medium | |
CN105096945A (en) | Voice recognition method and voice recognition device for terminal | |
KR20160055059A (en) | Method and apparatus for speech signal processing | |
CN107578773A (en) | A kind of method for quickly identifying of languages | |
US10600405B2 (en) | Speech signal processing method and speech signal processing apparatus | |
CN114528851A (en) | Reply statement determination method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20180112 |