CN106205610A - Voice information identification method and equipment - Google Patents
- Publication number
- CN106205610A CN106205610A CN201610500446.XA CN201610500446A CN106205610A CN 106205610 A CN106205610 A CN 106205610A CN 201610500446 A CN201610500446 A CN 201610500446A CN 106205610 A CN106205610 A CN 106205610A
- Authority
- CN
- China
- Prior art keywords
- vector
- segmentation
- identified
- stream information
- classification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/10—Speech classification or search using distance or distortion measures between unknown speech and reference templates
Abstract
The embodiment of the invention discloses a voice information identification method. The method includes: obtaining a voice stream to be identified; analyzing the voice stream to be identified and extracting first vectors corresponding to the voice stream to be identified; segmenting the first vectors to obtain second vectors; classifying the first vectors according to a preset classification principle to obtain third vectors; and matching the voice stream to be identified with a user according to the relationship between the second vector of each segment and the third vector of each class. An embodiment of the invention also discloses a voice information identification equipment.
Description
Technical field
The present invention relates to voice information identification technology in the communications field, and in particular to a voice information identification method and equipment.
Background technology
With the continual evolution of intelligent electronic devices, speech recognition is ever more widely applied. In everyday usage scenarios, however, an electronic device often receives voice information from multiple users at the same time; the device then cannot match each piece of voice information to the corresponding speaker, and consequently does not know which voice information's operation it should actually execute.
In the prior art, voice information can be matched to users according to the attributes of different speech. However, many attributes are needed to describe voice information; if the voice information to be identified is long and involves many users, the amount of computation is large, practical operation is complicated and difficult, and the user experience is poor.
Summary of the invention
To solve the above technical problem, embodiments of the present invention provide a voice information identification method and equipment, which solve the problem in the prior art that voice information identification involves a large amount of computation and is difficult to operate, reduce the difficulty of voice information identification and decrease the amount of computation, and at the same time improve the user experience.
The technical solution of the present invention is achieved as follows:
A voice information identification method, the method including:
obtaining a voice stream to be identified;
analyzing the voice stream to be identified and extracting first vectors corresponding to the voice stream to be identified;
segmenting the first vectors to obtain second vectors;
classifying the first vectors according to a preset classification principle to obtain third vectors;
matching the voice stream to be identified with a user according to the relationship between the second vector of each segment and the third vector of each class.
Optionally, segmenting the first vectors to obtain the second vectors includes:
segmenting the first vectors at a preset time interval according to the playback time of the voice stream to be identified, to obtain the second vectors.
Optionally, classifying the first vectors according to the preset classification principle to obtain the third vectors includes:
performing principal component analysis on the first vectors to obtain fourth vectors;
classifying the first vectors according to the fourth vectors to obtain the third vectors.
Optionally, classifying the first vectors according to the fourth vectors to obtain the third vectors includes:
performing variational Bayesian Gaussian mixture model clustering on the first vectors according to the fourth vectors, to obtain the third vectors.
Optionally, matching the voice stream to be identified with user identity information according to the relationship between the second vector of each segment and the third vector of each class includes:
matching each second vector in each segment with each third vector in each class;
if each second vector in each segment matches completely with a third vector in a class, performing voiceprint identification on the voice stream to be identified corresponding to the second vector of each segment, to obtain the identity information of the user corresponding to the voice stream to be identified.
Optionally, the method further includes:
if the second vectors in the segments do not match completely with the third vectors in the classes, re-segmenting the second vectors and simultaneously reclassifying the third vectors using the Viterbi algorithm, until the second vector in each re-segmented segment matches completely with one of the reclassified third vectors;
performing voiceprint identification on the voice stream to be identified corresponding to the second vector of each segment, to obtain the identity information of the user corresponding to the voice stream to be identified.
A voice information identification equipment, the equipment including: a first acquiring unit, a second acquiring unit, a third acquiring unit and a processing unit, wherein:
the first acquiring unit is configured to obtain a voice stream to be identified;
the first acquiring unit is further configured to analyze the voice stream to be identified and extract the first vectors corresponding to the voice stream to be identified;
the second acquiring unit is configured to segment the first vectors to obtain second vectors;
the third acquiring unit is configured to classify the first vectors according to a preset classification principle to obtain third vectors;
the processing unit is configured to match the voice stream to be identified with a user according to the relationship between the second vector of each segment and the third vector of each class.
Optionally, the second acquiring unit is specifically configured to:
segment the first vectors at a preset time interval according to the playback time of the voice stream to be identified, to obtain the second vectors.
Optionally, the third acquiring unit includes an analysis module and a first processing module, wherein:
the analysis module is configured to perform principal component analysis on the first vectors to obtain fourth vectors;
the first processing module is configured to classify the first vectors according to the fourth vectors to obtain the third vectors.
Optionally, the first processing module is specifically configured to:
perform variational Bayesian Gaussian mixture model clustering on the first vectors according to the fourth vectors, to obtain the third vectors.
Optionally, the processing unit includes a matching module and a second processing module, wherein:
the matching module is configured to match each second vector in each segment with each third vector in each class;
the second processing module is configured to, if each second vector in each segment matches completely with a third vector in a class, perform voiceprint identification on the voice stream to be identified corresponding to the second vector of each segment, to obtain the identity information of the user corresponding to the voice stream to be identified.
Optionally, the processing unit further includes a third processing module and a fourth processing module, wherein:
the third processing module is configured to, if the second vectors in the segments do not match completely with the third vectors in the classes, re-segment the second vectors and simultaneously reclassify the third vectors using the Viterbi algorithm, until the second vector in each re-segmented segment matches completely with one of the reclassified third vectors;
the fourth processing module is configured to perform voiceprint identification on the voice stream to be identified corresponding to the second vector of each segment, to obtain the identity information of the user corresponding to the voice stream to be identified.
With the voice information identification method and equipment provided by the embodiments of the present invention, a voice stream to be identified can be obtained and analyzed, and the first vectors corresponding to the voice stream to be identified are extracted; the first vectors are then segmented to obtain second vectors, while the first vectors are also classified according to a preset classification principle to obtain third vectors; finally, the voice stream to be identified is matched with a user according to the relationship between the second vector of each segment and the third vector of each class. In this way, the user matching the voice stream to be identified can be obtained from the vectors corresponding to the voice stream to be identified; the problems of the prior art that voice information identification involves a large amount of computation and is difficult to operate are solved, the difficulty of voice information identification is reduced and the amount of computation is decreased; at the same time, the user experience is improved.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a voice information identification method provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another voice information identification method provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of yet another voice information identification method provided by an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a voice information identification equipment provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of another voice information identification equipment provided by an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of yet another voice information identification equipment provided by an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of a voice information identification equipment provided by another embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention.
An embodiment of the present invention provides a voice information identification method. Referring to Fig. 1, the method includes the following steps:
Step 101: obtain a voice stream to be identified.
Specifically, obtaining the voice stream to be identified in step 101 may be performed by a voice information identification equipment. The voice stream to be identified may be voice information that a user inputs to an electronic device for speech recognition, and may be collected by a voice collector of the electronic device, such as a microphone.
Step 102: analyze the voice stream to be identified and extract the first vectors corresponding to the voice stream to be identified.
Specifically, step 102 may be performed by the voice information identification equipment; the first vectors may be obtained by extracting vectors from the voice stream to be identified under an initial segmentation condition and normalizing them.
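The extraction and normalization of step 102 can be sketched as follows. This is an illustrative toy sketch, not the patented implementation: a real system would extract speaker-discriminative vectors (such as i-vectors) from trained models, whereas the hypothetical log-spectrum feature below stands in purely to show the framing-and-normalization data flow.

```python
import numpy as np

def extract_first_vectors(waveform, frame_len=400, hop=160, n_feats=13):
    """Toy sketch of step 102: slice the to-be-identified stream into
    initial frames and produce one normalized feature vector per frame
    (the 'first vectors'). The log-power feature here is a stand-in for
    a real speaker embedding such as an i-vector."""
    frames = []
    for start in range(0, len(waveform) - frame_len + 1, hop):
        frame = waveform[start:start + frame_len]
        spec = np.abs(np.fft.rfft(frame))[:n_feats]    # toy spectral feature
        feat = np.log(spec + 1e-8)
        feat = feat / (np.linalg.norm(feat) + 1e-8)    # length normalization
        frames.append(feat)
    return np.stack(frames)

rng = np.random.default_rng(0)
wave = rng.standard_normal(16000)      # 1 s of synthetic audio at 16 kHz
X = extract_first_vectors(wave)
print(X.shape)                         # (98, 13)
```

Each row of `X` is one first vector; downstream steps segment and classify these rows.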
Step 103: segment the first vectors to obtain second vectors.
Specifically, step 103 may be performed by the voice information identification equipment. The first vectors may be segmented uniformly according to a preset time period, or segmented non-uniformly according to specific requirements.
Step 104: classify the first vectors according to a preset classification principle to obtain third vectors.
Specifically, step 104 may be performed by the voice information identification equipment.
Step 105: match the voice stream to be identified with a user according to the relationship between the second vector of each segment and the third vector of each class.
Specifically, step 105 may be performed by the voice information identification equipment; the matching relationship between the second vector of each segment obtained by segmentation and the third vector of each class obtained by classification may be compared, and the information of the user corresponding to the voice information to be identified is obtained according to the comparison result, thereby matching the voice stream to be identified with the user.
With the voice information identification method provided by the embodiments of the present invention, a voice stream to be identified can be obtained and analyzed, and the first vectors corresponding to the voice stream to be identified are extracted; the first vectors are then segmented to obtain second vectors, while the first vectors are also classified according to a preset classification principle to obtain third vectors; finally, the voice stream to be identified is matched with a user according to the relationship between the second vector of each segment and the third vector of each class. In this way, the user matching the voice stream to be identified can be obtained from the vectors corresponding to the voice stream to be identified; the problems of the prior art that voice information identification involves a large amount of computation and is difficult to operate are solved, the difficulty of voice information identification is reduced and the amount of computation is decreased; at the same time, the user experience is improved.
An embodiment of the present invention provides a voice information identification method. Referring to Fig. 2, the method includes the following steps:
Step 201: the voice information identification equipment obtains a voice stream to be identified.
Step 202: the voice information identification equipment analyzes the voice stream to be identified and extracts the first vectors corresponding to the voice stream to be identified.
Specifically, the first vectors may be obtained by extracting i-vectors from the voice stream to be identified under an initial segmentation condition and normalizing them; for example, the playback time of the voice information to be identified may be divided initially in units of one second, yielding the first vectors corresponding to the voice information to be identified.
Step 203: the voice information identification equipment segments the first vectors at a preset time interval according to the playback time of the voice stream to be identified, to obtain second vectors.
The preset time interval may be set in advance by the user according to factors of the specific application scenario, such as the playback duration of the voice information to be identified, the number of users involved in that voice information, and the desired identification success rate. For example, with a fixed duration of one minute or five minutes as a unit, and according to the actual playback time and order of the voice information to be identified, the voice information within each minute (or each five minutes) forms one segment; after segmentation, the set of vectors corresponding to the voice information of each segment is a second vector.
It should be noted that this embodiment describes segmenting the first vectors at a preset time interval to obtain the second vectors; the first vectors may equally be segmented at varying time intervals, and the specific segmentation scheme may be determined according to the actual application scenario.
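The grouping of step 203 can be sketched minimally as follows; `frames_per_segment` is a hypothetical stand-in for the preset time interval expressed as a number of frame-level first vectors.

```python
import numpy as np

def segment_vectors(first_vectors, frames_per_segment):
    """Sketch of step 203: group the frame-level first vectors into
    fixed-length time segments; the set of vectors in each segment is
    one 'second vector'."""
    n = len(first_vectors)
    return [first_vectors[i:i + frames_per_segment]
            for i in range(0, n, frames_per_segment)]

X = np.arange(20.0).reshape(10, 2)   # 10 frame vectors of dimension 2
segments = segment_vectors(X, frames_per_segment=4)
print([len(s) for s in segments])    # [4, 4, 2]
```

A non-uniform scheme would simply pass a list of cut points instead of a fixed `frames_per_segment`.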
Step 204: the voice information identification equipment performs principal component analysis on the first vectors to obtain fourth vectors.
Specifically, the principal component analysis of the first vectors may be based on a factor analysis of the first vectors; for the concrete implementation of principal component analysis, reference may be made to the related prior-art solutions.
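Since the patent defers to prior-art PCA, one standard formulation — eigendecomposition of the sample covariance — can illustrate step 204; the choice of `k` and the toy data are assumptions for the sketch.

```python
import numpy as np

def principal_components(first_vectors, k=2):
    """Sketch of step 204: principal component analysis of the first
    vectors; the top-k components play the role of the 'fourth vectors'."""
    X = first_vectors - first_vectors.mean(axis=0)
    cov = X.T @ X / (len(X) - 1)             # sample covariance
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]    # keep the k largest
    return eigvecs[:, order].T               # (k, dim) orthonormal basis

rng = np.random.default_rng(1)
# Toy first vectors whose variance is dominated by the first two axes.
X = rng.standard_normal((200, 5)) @ np.diag([3.0, 2.0, 1.0, 0.1, 0.1])
basis = principal_components(X, k=2)
print(basis.shape)   # (2, 5)
```

The rows of `basis` are the fourth vectors onto which step 205 maps the first vectors.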
Step 205: the voice information identification equipment classifies the first vectors according to the fourth vectors to obtain third vectors.
Specifically, the first vectors may be classified on the basis of the fourth vectors: each first vector is mapped onto the fourth vectors, and the third vectors are then obtained by classifying according to the actual mapping result.
Step 206: match the voice stream to be identified with a user according to the relationship between the second vector of each segment and the third vector of each class.
It should be noted that for explanations of steps or concepts in this embodiment that are the same as in other embodiments, reference may be made to the descriptions in those other embodiments, which are not repeated here.
With the voice information identification method provided by the embodiments of the present invention, a voice stream to be identified can be obtained and analyzed, and the first vectors corresponding to the voice stream to be identified are extracted; the first vectors are then segmented to obtain second vectors, while the first vectors are also classified according to a preset classification principle to obtain third vectors; finally, the voice stream to be identified is matched with a user according to the relationship between the second vector of each segment and the third vector of each class. In this way, the user matching the voice stream to be identified can be obtained from the vectors corresponding to the voice stream to be identified; the problems of the prior art that voice information identification involves a large amount of computation and is difficult to operate are solved, the difficulty of voice information identification is reduced and the amount of computation is decreased; at the same time, the user experience is improved.
An embodiment of the present invention provides a voice information identification method. Referring to Fig. 3, the method includes the following steps:
Step 301: the voice information identification equipment obtains a voice stream to be identified.
Step 302: the voice information identification equipment analyzes the voice stream to be identified and extracts the first vectors corresponding to the voice stream to be identified.
Step 303: the voice information identification equipment segments the first vectors at a preset time interval according to the playback time of the voice stream to be identified, to obtain second vectors.
Step 304: the voice information identification equipment performs principal component analysis on the first vectors to obtain fourth vectors.
Step 305: the voice information identification equipment performs variational Bayesian Gaussian mixture model clustering on the first vectors according to the fourth vectors, to obtain third vectors.
Specifically, the voice information identification equipment may form a coordinate system on the basis of the fourth vectors, map each first vector into the coordinate system formed by the fourth vectors, and then classify the mapped first vectors by variational Bayesian Gaussian mixture model clustering to obtain the third vectors.
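Step 305 names variational Bayesian Gaussian mixture clustering; a minimal illustration uses scikit-learn's `BayesianGaussianMixture` (the library and toy two-speaker data are assumptions of this sketch, not mandated by the patent). Thanks to the Dirichlet prior on mixture weights, superfluous components are suppressed, so the number of classes need not be fixed in advance.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Toy projected first vectors for two hypothetical speakers.
rng = np.random.default_rng(2)
speaker_a = rng.normal(loc=0.0, scale=0.3, size=(100, 2))
speaker_b = rng.normal(loc=3.0, scale=0.3, size=(100, 2))
X = np.vstack([speaker_a, speaker_b])

# VB-GMM with more components than expected speakers; the variational
# prior shrinks the weights of unneeded components toward zero.
vbgmm = BayesianGaussianMixture(n_components=5,
                                weight_concentration_prior=0.1,
                                random_state=0).fit(X)
labels = vbgmm.predict(X)
# Each cluster's set of vectors plays the role of one 'third vector'.
third_vectors = {c: X[labels == c] for c in np.unique(labels)}
print(len(third_vectors))   # effective number of clusters found
```

In the patent's terms, the per-cluster vector sets are the third vectors against which the time-based second vectors are matched in step 306.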
Step 306: the voice information identification equipment matches each second vector in each segment with each third vector in each class.
Specifically, a matching-degree judgment may be made between all vectors of each second vector obtained by segmentation and all vectors of each third vector obtained by classification, where one second vector corresponds to one third vector.
It should be noted that after each second vector in each segment is matched with each third vector in each class in step 306, either step 307 or steps 308 to 309 are executed according to the matching result: if each second vector in each segment matches completely with a third vector in a class, step 307 is executed; otherwise, steps 308 to 309 are executed.
Step 307: if each second vector in each segment matches completely with a third vector in a class, the voice information identification equipment performs voiceprint identification on the voice stream to be identified corresponding to the second vector of each segment, to obtain the identity information of the user corresponding to the voice stream to be identified.
Specifically, if all vectors of the second vector in each segment match all vectors of the corresponding third vector, each second vector in each segment matches completely with a third vector in a class. The segmentation of the first vectors can then be considered accurate, and the voice information to be identified corresponding to the second vector of one segment can be considered the voice information of a single user; the voice stream to be identified corresponding to each second vector can be matched directly with user information according to each user's own voiceprint feature information, yielding the identity information of the corresponding user. The voiceprint feature information may include feature information that uniquely identifies a user, such as the user's timbre, pitch, voice quality and volume.
Step 308: if the second vectors in the segments do not match completely with the third vectors in the classes, the voice information identification equipment re-segments the second vectors and simultaneously reclassifies the third vectors using the Viterbi algorithm, until the second vector in each re-segmented segment matches completely with one of the reclassified third vectors.
Specifically, if a vector of the second vector in a segment fails to match the vectors of the corresponding third vector, the second vectors in the segments do not match completely with the third vectors in the classes. The segmentation result then needs to be segmented again while the classification result is reclassified using the Viterbi algorithm; the re-segmented second vectors are then compared with the reclassified third vectors to check whether they match completely. If unmatched vectors remain, the re-segmented second vectors are segmented once more and the reclassified third vectors are reclassified again using the Viterbi algorithm, until the second vector in each re-segmented segment matches completely with one of the reclassified third vectors. Once they match completely, the re-segmentation of the first vectors can be considered accurate, and the voice information to be identified corresponding to the second vector of one segment can be considered the voice information of a single user; the voice stream to be identified corresponding to each second vector can be matched with user information according to each user's own voiceprint feature information, yielding the identity information of the corresponding user.
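The patent does not spell out the Viterbi reclassification of step 308; one common concrete reading, borrowed from speaker-diarization resegmentation, decodes a smoothed cluster-label sequence over per-frame likelihoods with a "sticky" transition prior. Everything below (the transition model, `stay_prob`, the toy likelihoods) is an assumed illustration of that reading.

```python
import numpy as np

def viterbi_resegment(log_lik, stay_prob=0.95):
    """Sketch of step 308: re-assign frames to clusters with the Viterbi
    algorithm. log_lik[t, c] is the log-likelihood of frame t under
    cluster c; a sticky transition prior discourages rapid speaker
    switches, yielding contiguous re-segmented regions."""
    T, C = log_lik.shape
    log_trans = np.full((C, C), np.log((1 - stay_prob) / max(C - 1, 1)))
    np.fill_diagonal(log_trans, np.log(stay_prob))
    delta = np.zeros((T, C))              # best path score ending in c at t
    psi = np.zeros((T, C), dtype=int)     # backpointers
    delta[0] = log_lik[0]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_trans
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_lik[t]
    path = np.zeros(T, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(T - 2, -1, -1):        # backtrace
        path[t] = psi[t + 1, path[t + 1]]
    return path

# Two clusters; frames 0-4 favor cluster 0, frames 5-9 favor cluster 1.
ll = np.log(np.vstack([np.tile([0.9, 0.1], (5, 1)),
                       np.tile([0.1, 0.9], (5, 1))]))
print(viterbi_resegment(ll))   # [0 0 0 0 0 1 1 1 1 1]
```

The decoded label runs define the new segment boundaries; the loop of step 308 would alternate this decode with re-segmentation until segments and clusters agree.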
Step 309: the voice information identification equipment performs voiceprint identification on the voice stream to be identified corresponding to the second vector of each segment, to obtain the identity information of the user corresponding to the voice stream to be identified.
It should be noted that for explanations of steps or concepts in this embodiment that are the same as in other embodiments, reference may be made to the descriptions in those other embodiments, which are not repeated here.
With the voice information identification method provided by the embodiments of the present invention, a voice stream to be identified can be obtained and analyzed, and the first vectors corresponding to the voice stream to be identified are extracted; the first vectors are then segmented to obtain second vectors, while the first vectors are also classified according to a preset classification principle to obtain third vectors; finally, the voice stream to be identified is matched with a user according to the relationship between the second vector of each segment and the third vector of each class. In this way, the user matching the voice stream to be identified can be obtained from the vectors corresponding to the voice stream to be identified; the problems of the prior art that voice information identification involves a large amount of computation and is difficult to operate are solved, the difficulty of voice information identification is reduced and the amount of computation is decreased; at the same time, the user experience is improved.
An embodiment of the present invention provides a voice information identification equipment 4, which can be applied in the voice information identification method provided by the embodiments corresponding to Figs. 1 to 3. Referring to Fig. 4, the equipment may include: a first acquiring unit 41, a second acquiring unit 42, a third acquiring unit 43 and a processing unit 44, wherein:
The first acquiring unit 41 is configured to obtain a voice stream to be identified.
The first acquiring unit 41 is further configured to analyze the voice stream to be identified and extract the first vectors corresponding to the voice stream to be identified.
The second acquiring unit 42 is configured to segment the first vectors to obtain second vectors.
The third acquiring unit 43 is configured to classify the first vectors according to a preset classification principle to obtain third vectors.
The processing unit 44 is configured to match the voice stream to be identified with a user according to the relationship between the second vector of each segment and the third vector of each class.
With the voice information identification equipment provided by the embodiments of the present invention, a voice stream to be identified can be obtained and analyzed, and the first vectors corresponding to the voice stream to be identified are extracted; the first vectors are then segmented to obtain second vectors, while the first vectors are also classified according to a preset classification principle to obtain third vectors; finally, the voice stream to be identified is matched with a user according to the relationship between the second vector of each segment and the third vector of each class. In this way, the user matching the voice stream to be identified can be obtained from the vectors corresponding to the voice stream to be identified; the problems of the prior art that voice information identification involves a large amount of computation and is difficult to operate are solved, the difficulty of voice information identification is reduced and the amount of computation is decreased; at the same time, the user experience is improved.
Specifically, the second acquiring unit 42 is configured to perform the following step:
segmenting the first vector at a preset time interval according to the playback time of the voice stream information to be identified, to obtain the second vector.
Further, referring to Fig. 5, the third acquiring unit 43 includes: an analysis module 431 and a first processing module 432, wherein:
The analysis module 431 is configured to perform principal component analysis on the first vector to obtain a fourth vector.
The first processing module 432 is configured to classify the first vector according to the fourth vector to obtain the third vector.
Specifically, the first processing module 432 is further configured to perform the following step:
performing variational Bayesian Gaussian mixture model clustering on the first vector according to the fourth vector, to obtain the third vector.
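The classification step (principal component analysis followed by variational Bayesian Gaussian mixture model clustering) can be sketched with off-the-shelf tools; scikit-learn's `BayesianGaussianMixture` implements variational inference for Gaussian mixtures. The dimensionalities, component count, and synthetic data below are illustrative assumptions, not values from the patent:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.RandomState(0)
# Hypothetical first vector: 300 frames of 13-dim features from two speakers
first_vector = np.vstack([rng.randn(150, 13) + 5, rng.randn(150, 13) - 5])

# Principal component analysis -> fourth vector (reduced representation)
fourth_vector = PCA(n_components=4).fit_transform(first_vector)

# Variational Bayes GMM clustering on the reduced features -> third vector
vbgmm = BayesianGaussianMixture(n_components=8, random_state=0)
third_vector = vbgmm.fit_predict(fourth_vector)
print(third_vector.shape)  # one class label per frame
```

A practical property of the variational Bayes formulation is that it can effectively prune unused mixture components, so the number of active classes (here, speakers) need not be known exactly in advance.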
Further, referring to Fig. 6, the processing unit 44 includes: a matching module 441 and a second processing module 442, wherein:
The matching module 441 is configured to match each second vector in each segment with each third vector in each class.
The second processing module 442 is configured to, if each second vector in each segment matches each third vector in each class completely, respectively perform voiceprint information recognition on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
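One hypothetical reading of "matches completely" is that every frame in a segment was assigned to the same class, i.e. segment boundaries agree with cluster boundaries. The helper below is an illustrative sketch under that assumption, not the patent's specified test:

```python
def segment_matches_completely(segment_labels):
    """A segment 'matches completely' (hypothetical reading) when every
    frame in the segment was assigned to the same class label."""
    return len(set(segment_labels)) == 1

# Per-segment class labels: third-vector labels grouped by second-vector segment
segments = [[0, 0, 0], [1, 1, 1], [1, 0, 1]]
results = [segment_matches_completely(s) for s in segments]
print(results)  # [True, True, False] -- the third segment straddles two classes
```

Segments that fail this check would fall through to the re-segmentation and reclassification path described next.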
Further, referring to Fig. 7, the processing unit 44 further includes: a third processing module 443 and a fourth processing module 444, wherein:
The third processing module 443 is configured to, if each second vector in each segment does not completely match each third vector in each class, re-segment the segments in the second vector and simultaneously use the Viterbi algorithm to reclassify the third vector, until the second vector in each re-segmented segment completely matches each reclassified third vector.
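The Viterbi-based reclassification in the third processing module 443 can be sketched as decoding the most likely class sequence over frames with a penalty for switching classes, which smooths away stray frame labels. The emission scores and switch penalty below are illustrative assumptions, not the patent's specification:

```python
import numpy as np

def viterbi_relabel(log_likelihoods, switch_penalty=2.0):
    """Return the most likely class per frame, given per-frame class
    log-likelihoods and a fixed penalty for switching class between
    consecutive frames (illustrative stand-in for the reclassification)."""
    n_frames, n_classes = log_likelihoods.shape
    same = np.arange(n_classes)[:, None] == np.arange(n_classes)[None, :]
    penalty = np.where(same, 0.0, switch_penalty)   # (prev, cur) switch cost
    score = log_likelihoods[0].copy()
    back = np.zeros((n_frames, n_classes), dtype=int)
    for t in range(1, n_frames):
        trans = score[:, None] - penalty            # score of prev -> cur
        back[t] = trans.argmax(axis=0)
        score = trans.max(axis=0) + log_likelihoods[t]
    path = [int(score.argmax())]
    for t in range(n_frames - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# One stray frame of class 1 inside a run of class 0
ll = np.array([[0.0, -3.0], [0.0, -3.0], [-1.0, 0.0], [0.0, -3.0], [0.0, -3.0]])
print(viterbi_relabel(ll))                      # [0, 0, 0, 0, 0] stray smoothed
print(viterbi_relabel(ll, switch_penalty=0.0))  # [0, 0, 1, 0, 0] no smoothing
```

With the switch penalty in place, the decoded label sequence changes class only at genuine boundaries, which is what allows segmentation and classification to converge to a complete match.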
The fourth processing module 444 is configured to respectively perform voiceprint information recognition on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
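The patent does not detail how the voiceprint information recognition is performed; one common approach (an assumption here, not the patent's method) scores a segment-level voiceprint embedding against enrolled user embeddings by cosine similarity. All names and vectors below are hypothetical:

```python
import numpy as np

def identify_user(segment_embedding, enrolled):
    """Return the enrolled user whose voiceprint embedding has the highest
    cosine similarity to the segment embedding (illustrative sketch only)."""
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(enrolled, key=lambda name: cos(segment_embedding, enrolled[name]))

enrolled = {
    "alice": np.array([1.0, 0.0, 0.0]),  # hypothetical enrolled voiceprints
    "bob":   np.array([0.0, 1.0, 0.0]),
}
print(identify_user(np.array([0.9, 0.1, 0.0]), enrolled))  # alice
```

In a real deployment the embeddings would come from an enrollment phase, and a similarity threshold would typically be applied before accepting an identity.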
It should be noted that, for the interaction between the units and modules in the embodiment of the present invention, reference may be made to the interaction in the voice information identification method provided by the embodiments corresponding to Figs. 1-3, which is not repeated here.
The voice information identification device provided by the embodiments of the present invention can obtain voice stream information to be identified, analyze it and extract the corresponding first vector, segment the first vector to obtain a second vector, and at the same time classify the first vector according to a preset classification principle to obtain a third vector; it then matches the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class. In this way, the user matching the voice stream information to be identified can be obtained from the vectors corresponding to it. This solves the problems in prior-art voice information identification schemes of a large amount of calculation and high operational difficulty, reduces both the difficulty of voice information identification and the amount of calculation, and at the same time improves the user experience.
In practical applications, the first acquiring unit 41, the second acquiring unit 42, the third acquiring unit 43, the processing unit 44, the analysis module 431, the first processing module 432, the matching module 441, the second processing module 442, the third processing module 443, and the fourth processing module 444 may all be implemented by a central processing unit (Central Processing Unit, CPU), a microprocessor (Micro Processor Unit, MPU), a digital signal processor (Digital Signal Processor, DSP), or a field programmable gate array (Field Programmable Gate Array, FPGA) located in the wireless data transmission device.
Those skilled in the art should appreciate that embodiments of the present invention may be provided as a method, a system, or a computer program product. Therefore, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk memory and optical memory) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, the device (system), and the computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce a device for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to work in a particular manner, such that the instructions stored in the computer-readable memory produce a manufacture including an instruction device, the instruction device implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
The above are only preferred embodiments of the present invention and are not intended to limit the protection scope of the present invention.
Claims (12)
1. A voice information identification method, wherein the method includes:
obtaining voice stream information to be identified;
analyzing the voice stream information to be identified, and extracting a first vector corresponding to the voice stream information to be identified;
segmenting the first vector to obtain a second vector;
classifying the first vector according to a preset classification principle to obtain a third vector;
matching the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class.
2. The method according to claim 1, wherein segmenting the first vector to obtain the second vector includes:
segmenting the first vector at a preset time interval according to the playback time of the voice stream information to be identified, to obtain the second vector.
3. The method according to claim 1, wherein classifying the first vector according to the preset classification principle to obtain the third vector includes:
performing principal component analysis on the first vector to obtain a fourth vector;
classifying the first vector according to the fourth vector to obtain the third vector.
4. The method according to claim 3, wherein classifying the first vector according to the fourth vector to obtain the third vector includes:
performing variational Bayesian Gaussian mixture model clustering on the first vector according to the fourth vector, to obtain the third vector.
5. The method according to claim 1, wherein matching the voice stream information to be identified with the user according to the relation between the second vector of each segment and the third vector of each class includes:
matching each second vector in each segment with each third vector in each class;
if each second vector in each segment matches each third vector in each class completely, respectively performing voiceprint information recognition on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
6. The method according to claim 5, wherein the method further includes:
if each second vector in each segment does not completely match each third vector in each class, re-segmenting the segments in the second vector and simultaneously using the Viterbi algorithm to reclassify the third vector, until the second vector in each re-segmented segment completely matches each reclassified third vector;
respectively performing voiceprint information recognition on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
7. A voice information identification device, wherein the device includes: a first acquiring unit, a second acquiring unit, a third acquiring unit, and a processing unit, wherein:
the first acquiring unit is configured to obtain voice stream information to be identified;
the first acquiring unit is further configured to analyze the voice stream information to be identified and extract a first vector corresponding to the voice stream information to be identified;
the second acquiring unit is configured to segment the first vector to obtain a second vector;
the third acquiring unit is configured to classify the first vector according to a preset classification principle to obtain a third vector;
the processing unit is configured to match the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class.
8. The device according to claim 7, wherein the second acquiring unit is specifically configured to:
segment the first vector at a preset time interval according to the playback time of the voice stream information to be identified, to obtain the second vector.
9. The device according to claim 7, wherein the third acquiring unit includes: an analysis module and a first processing module, wherein:
the analysis module is configured to perform principal component analysis on the first vector to obtain a fourth vector;
the first processing module is configured to classify the first vector according to the fourth vector to obtain the third vector.
10. The device according to claim 9, wherein the first processing module is specifically configured to:
perform variational Bayesian Gaussian mixture model clustering on the first vector according to the fourth vector, to obtain the third vector.
11. The device according to claim 7, wherein the processing unit includes: a matching module and a second processing module, wherein:
the matching module is configured to match each second vector in each segment with each third vector in each class;
the second processing module is configured to, if each second vector in each segment matches each third vector in each class completely, respectively perform voiceprint information recognition on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
12. The device according to claim 11, wherein the processing unit further includes: a third processing module and a fourth processing module, wherein:
the third processing module is configured to, if each second vector in each segment does not completely match each third vector in each class, re-segment the segments in the second vector and simultaneously use the Viterbi algorithm to reclassify the third vector, until the second vector in each re-segmented segment completely matches each reclassified third vector;
the fourth processing module is configured to respectively perform voiceprint information recognition on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610500446.XA CN106205610B (en) | 2016-06-29 | 2016-06-29 | A kind of voice information identification method and equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610500446.XA CN106205610B (en) | 2016-06-29 | 2016-06-29 | A kind of voice information identification method and equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106205610A true CN106205610A (en) | 2016-12-07 |
CN106205610B CN106205610B (en) | 2019-11-26 |
Family
ID=57463742
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610500446.XA Active CN106205610B (en) | 2016-06-29 | 2016-06-29 | A kind of voice information identification method and equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106205610B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108154588A (en) * | 2017-12-29 | 2018-06-12 | 深圳市艾特智能科技有限公司 | Unlocking method, system, readable storage medium and smart device |
CN108597540A (en) * | 2018-04-11 | 2018-09-28 | 南京信息工程大学 | A kind of speech-emotion recognition method based on variation mode decomposition and extreme learning machine |
CN109559744A (en) * | 2018-12-12 | 2019-04-02 | 泰康保险集团股份有限公司 | Voice data processing method and device, and readable storage medium |
CN110347248A (en) * | 2019-06-24 | 2019-10-18 | 歌尔科技有限公司 | Interaction processing method, device, equipment and audio frequency apparatus |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080235016A1 (en) * | 2007-01-23 | 2008-09-25 | Infoture, Inc. | System and method for detection and analysis of speech |
US7680657B2 (en) * | 2006-08-15 | 2010-03-16 | Microsoft Corporation | Auto segmentation based partitioning and clustering approach to robust endpointing |
CN102543063A (en) * | 2011-12-07 | 2012-07-04 | 华南理工大学 | Method for estimating speech speed of multiple speakers based on segmentation and clustering of speakers |
CN102760434A (en) * | 2012-07-09 | 2012-10-31 | 华为终端有限公司 | Method for updating voiceprint feature model and terminal |
CN103229233A (en) * | 2010-12-10 | 2013-07-31 | 松下电器产业株式会社 | Modeling device and method for speaker recognition, and speaker recognition system |
CN103811020A (en) * | 2014-03-05 | 2014-05-21 | 东北大学 | Smart voice processing method |
CN103871424A (en) * | 2012-12-13 | 2014-06-18 | 上海八方视界网络科技有限公司 | Online speaking people cluster analysis method based on bayesian information criterion |
CN105161093A (en) * | 2015-10-14 | 2015-12-16 | 科大讯飞股份有限公司 | Method and system for determining the number of speakers |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7680657B2 (en) * | 2006-08-15 | 2010-03-16 | Microsoft Corporation | Auto segmentation based partitioning and clustering approach to robust endpointing |
US20080235016A1 (en) * | 2007-01-23 | 2008-09-25 | Infoture, Inc. | System and method for detection and analysis of speech |
CN103229233A (en) * | 2010-12-10 | 2013-07-31 | 松下电器产业株式会社 | Modeling device and method for speaker recognition, and speaker recognition system |
CN102543063A (en) * | 2011-12-07 | 2012-07-04 | 华南理工大学 | Method for estimating speech speed of multiple speakers based on segmentation and clustering of speakers |
CN102760434A (en) * | 2012-07-09 | 2012-10-31 | 华为终端有限公司 | Method for updating voiceprint feature model and terminal |
CN103871424A (en) * | 2012-12-13 | 2014-06-18 | 上海八方视界网络科技有限公司 | Online speaking people cluster analysis method based on bayesian information criterion |
CN103811020A (en) * | 2014-03-05 | 2014-05-21 | 东北大学 | Smart voice processing method |
CN105161093A (en) * | 2015-10-14 | 2015-12-16 | 科大讯飞股份有限公司 | Method and system for determining the number of speakers |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108154588A (en) * | 2017-12-29 | 2018-06-12 | 深圳市艾特智能科技有限公司 | Unlocking method, system, readable storage medium and smart device |
CN108597540A (en) * | 2018-04-11 | 2018-09-28 | 南京信息工程大学 | A kind of speech-emotion recognition method based on variation mode decomposition and extreme learning machine |
CN109559744A (en) * | 2018-12-12 | 2019-04-02 | 泰康保险集团股份有限公司 | Voice data processing method and device, and readable storage medium |
CN109559744B (en) * | 2018-12-12 | 2022-07-08 | 泰康保险集团股份有限公司 | Voice data processing method and device and readable storage medium |
CN110347248A (en) * | 2019-06-24 | 2019-10-18 | 歌尔科技有限公司 | Interaction processing method, device, equipment and audio frequency apparatus |
Also Published As
Publication number | Publication date |
---|---|
CN106205610B (en) | 2019-11-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108922518B (en) | Voice data amplification method and system | |
CN108962255B (en) | Emotion recognition method, emotion recognition device, server and storage medium for voice conversation | |
WO2021068487A1 (en) | Face recognition model construction method, apparatus, computer device, and storage medium | |
CN106104674B (en) | Mixing voice identification | |
Wu et al. | Two-level hierarchical alignment for semi-coupled HMM-based audiovisual emotion recognition with temporal course | |
CN106294774A (en) | User individual data processing method based on dialogue service and device | |
CN106486126B (en) | Speech recognition error correction method and device | |
CN106205610A (en) | A kind of voice information identification method and equipment | |
TW201117110A (en) | Behavior recognition system and recognition method by combining image and speech, and the computer | |
CN110148399A (en) | A kind of control method of smart machine, device, equipment and medium | |
CN104795065A (en) | Method for increasing speech recognition rate and electronic device | |
CN108172219A (en) | The method and apparatus for identifying voice | |
CN109903392A (en) | Augmented reality method and apparatus | |
CN112509583A (en) | Auxiliary supervision method and system based on scheduling operation order system | |
CN109376363A (en) | A kind of real-time voice interpretation method and device based on earphone | |
CN110211609A (en) | A method of promoting speech recognition accuracy | |
CN109324515A (en) | A kind of method and controlling terminal controlling intelligent electric appliance | |
WO2022237633A1 (en) | Image processing method, apparatus, and device, and storage medium | |
CN111553899A (en) | Audio and video based Parkinson non-contact intelligent detection method and system | |
CN111462762B (en) | Speaker vector regularization method and device, electronic equipment and storage medium | |
CN114694651A (en) | Intelligent terminal control method and device, electronic equipment and storage medium | |
CN108446403A (en) | Language exercise method, apparatus, intelligent vehicle mounted terminal and storage medium | |
CN112489678A (en) | Scene recognition method and device based on channel characteristics | |
CN112435672A (en) | Voiceprint recognition method, device, equipment and storage medium | |
CN106971731B (en) | Correction method for voiceprint recognition |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |