CN106205610A - Voice information identification method and device

Voice information identification method and device

Info

Publication number
CN106205610A
CN106205610A
Authority
CN
China
Prior art keywords
vector
segmentation
identified
stream information
classification
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610500446.XA
Other languages
Chinese (zh)
Other versions
CN106205610B (en)
Inventor
杨大业
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lenovo Beijing Ltd filed Critical Lenovo Beijing Ltd
Priority to CN201610500446.XA priority Critical patent/CN106205610B/en
Publication of CN106205610A publication Critical patent/CN106205610A/en
Application granted granted Critical
Publication of CN106205610B publication Critical patent/CN106205610B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L 15/065 - Adaptation
    • G10L 15/07 - Adaptation to the speaker
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/08 - Speech classification or search
    • G10L 15/10 - Speech classification or search using distance or distortion measures between unknown speech and reference templates

Abstract

An embodiment of the invention discloses a voice information identification method. The method includes: obtaining voice stream information to be identified; analyzing the voice stream information to be identified and extracting a first vector corresponding to the voice stream information to be identified; segmenting the first vector to obtain a second vector; classifying the first vector according to a preset classification rule to obtain a third vector; and matching the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class. An embodiment of the invention also discloses a voice information identification device.

Description

Voice information identification method and device
Technical field
The present invention relates to voice information identification technology in the field of communications, and in particular to a voice information identification method and device.
Background
With the continuous upgrading of intelligent electronic devices, speech recognition is being applied more and more widely. In everyday use, however, an electronic device often receives voice information from several users at the same time; the device then cannot match each piece of voice information to the corresponding speaker and therefore does not know which voice command it should actually execute.
In the prior art, voice information can be matched with a user according to the attributes of different voices. However, many attributes are needed to describe the voice information; if the voice information to be identified is long and involves many users, the amount of computation is large, the practical operation is complicated and difficult, and the user experience is poor.
Summary of the invention
To solve the above technical problem, embodiments of the present invention provide a voice information identification method and device, which solve the problem that voice information identification schemes in the prior art involve a large amount of computation and are difficult to operate; they reduce the difficulty of voice information identification, reduce the amount of computation and, at the same time, improve the user experience.
The technical solution of the present invention is achieved as follows:
A voice information identification method, the method including:
obtaining voice stream information to be identified;
analyzing the voice stream information to be identified, and extracting a first vector corresponding to the voice stream information to be identified;
segmenting the first vector to obtain a second vector;
classifying the first vector according to a preset classification rule to obtain a third vector;
matching the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class.
Optionally, segmenting the first vector to obtain the second vector includes:
segmenting the first vector at a preset time interval according to the playback time of the voice stream information to be identified, to obtain the second vector.
Optionally, classifying the first vector according to the preset classification rule to obtain the third vector includes:
performing principal component analysis on the first vector to obtain a fourth vector;
classifying the first vector according to the fourth vector to obtain the third vector.
Optionally, classifying the first vector according to the fourth vector to obtain the third vector includes:
performing variational Bayesian Gaussian mixture model clustering on the first vector according to the fourth vector to obtain the third vector.
Optionally, matching the voice stream information to be identified with user identity information according to the relation between the second vector of each segment and the third vector of each class includes:
matching each second vector in each segment with each third vector in each class;
if each second vector in each segment completely matches a third vector in a class, performing voiceprint identification on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
Optionally, the method further includes:
if a second vector in a segment does not completely match the third vectors in the classes, re-segmenting the second vector and, at the same time, reclassifying the third vector with the Viterbi algorithm, until the second vector in each re-segmented segment completely matches a reclassified third vector;
performing voiceprint identification on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
A voice information identification device, the device including: a first obtaining unit, a second obtaining unit, a third obtaining unit and a processing unit, wherein:
the first obtaining unit is configured to obtain voice stream information to be identified;
the first obtaining unit is further configured to analyze the voice stream information to be identified and extract a first vector corresponding to the voice stream information to be identified;
the second obtaining unit is configured to segment the first vector to obtain a second vector;
the third obtaining unit is configured to classify the first vector according to a preset classification rule to obtain a third vector;
the processing unit is configured to match the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class.
Optionally, the second obtaining unit is specifically configured to:
segment the first vector at a preset time interval according to the playback time of the voice stream information to be identified, to obtain the second vector.
Optionally, the third obtaining unit includes an analysis module and a first processing module, wherein:
the analysis module is configured to perform principal component analysis on the first vector to obtain a fourth vector;
the first processing module is configured to classify the first vector according to the fourth vector to obtain the third vector.
Optionally, the first processing module is specifically configured to:
perform variational Bayesian Gaussian mixture model clustering on the first vector according to the fourth vector to obtain the third vector.
Optionally, the processing unit includes a matching module and a second processing module, wherein:
the matching module is configured to match each second vector in each segment with each third vector in each class;
the second processing module is configured to, if each second vector in each segment completely matches a third vector in a class, perform voiceprint identification on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
Optionally, the processing unit further includes a third processing module and a fourth processing module, wherein:
the third processing module is configured to, if a second vector in a segment does not completely match the third vectors in the classes, re-segment the second vector and, at the same time, reclassify the third vector with the Viterbi algorithm, until the second vector in each re-segmented segment completely matches a reclassified third vector;
the fourth processing module is configured to perform voiceprint identification on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
The voice information identification method and device provided by the embodiments of the present invention can obtain voice stream information to be identified, analyze it and extract the corresponding first vector, then segment the first vector to obtain a second vector while classifying the first vector according to a preset classification rule to obtain a third vector, and finally match the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class. In this way, the user matching the voice stream information to be identified can be obtained from the vectors corresponding to that voice stream information, which solves the problem that voice information identification schemes in the prior art involve a large amount of computation and are difficult to operate, reduces the difficulty of voice information identification, reduces the amount of computation and, at the same time, improves the user experience.
Brief description of the drawings
Fig. 1 is a flow chart of a voice information identification method provided by an embodiment of the present invention;
Fig. 2 is a flow chart of another voice information identification method provided by an embodiment of the present invention;
Fig. 3 is a flow chart of yet another voice information identification method provided by an embodiment of the present invention;
Fig. 4 is a structural diagram of a voice information identification device provided by an embodiment of the present invention;
Fig. 5 is a structural diagram of another voice information identification device provided by an embodiment of the present invention;
Fig. 6 is a structural diagram of yet another voice information identification device provided by an embodiment of the present invention;
Fig. 7 is a structural diagram of a voice information identification device provided by another embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention.
An embodiment of the present invention provides a voice information identification method. Referring to Fig. 1, the method includes the following steps:
Step 101: obtain voice stream information to be identified.
Specifically, step 101 may be performed by a voice information identification device. The voice stream information to be identified may be voice information that a user inputs into an electronic device for speech recognition, and it may be collected by a voice collector of the electronic device, such as a microphone.
Step 102: analyze the voice stream information to be identified, and extract a first vector corresponding to the voice stream information to be identified.
Specifically, step 102 may be performed by the voice information identification device; the vectors extracted from the voice stream information to be identified under an initial segmentation condition may be normalized to obtain the first vector.
Step 103: segment the first vector to obtain a second vector.
Specifically, step 103 may be performed by the voice information identification device. The first vector may be segmented uniformly according to a preset time period, or non-uniformly according to specific requirements.
Step 104: classify the first vector according to a preset classification rule to obtain a third vector.
Specifically, step 104 may be performed by the voice information identification device.
Step 105: match the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class.
Specifically, step 105 may be performed by the voice information identification device; the matching relation between the second vector of each segment obtained by segmentation and the third vector of each class obtained by classification may be compared, the information of the user corresponding to the voice information to be identified is obtained from the comparison result, and the voice stream information to be identified is thereby matched with the user.
The voice information identification method provided by this embodiment of the present invention can obtain voice stream information to be identified, analyze it and extract the corresponding first vector, then segment the first vector to obtain a second vector while classifying the first vector according to a preset classification rule to obtain a third vector, and finally match the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class. In this way, the user matching the voice stream information to be identified can be obtained from the vectors corresponding to that voice stream information, which solves the problem that voice information identification schemes in the prior art involve a large amount of computation and are difficult to operate, reduces the difficulty of voice information identification, reduces the amount of computation and, at the same time, improves the user experience.
An embodiment of the present invention provides a voice information identification method. Referring to Fig. 2, the method includes the following steps:
Step 201: the voice information identification device obtains voice stream information to be identified.
Step 202: the voice information identification device analyzes the voice stream information to be identified and extracts a first vector corresponding to the voice stream information to be identified.
Specifically, the first vector may be obtained by extracting and normalizing i-vectors from the voice stream information to be identified under an initial segmentation condition; for example, the playback time of the voice information to be identified may be initially segmented into units of one second each, and the first vector corresponding to the voice information to be identified is obtained from these units.
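Purely as an illustration of this step (it is not part of the patent text), the following Python sketch assembles a first vector of this kind: the signal is cut into one-second windows, a fixed-length speaker embedding is extracted per window, and each embedding is length-normalized. The helper extract_ivector is a hypothetical placeholder for whatever i-vector or embedding extractor is actually used.

```python
import numpy as np

def build_first_vector(samples, sample_rate, extract_ivector):
    """Cut the signal into one-second windows and stack normalized embeddings.

    extract_ivector is a hypothetical callable that maps a window of raw
    samples to a fixed-length embedding (for example an i-vector); any real
    extractor could be plugged in here.
    """
    window = sample_rate  # one second of samples per initial unit
    vectors = []
    for start in range(0, len(samples) - window + 1, window):
        emb = np.asarray(extract_ivector(samples[start:start + window]), dtype=float)
        emb = emb / (np.linalg.norm(emb) + 1e-10)  # length normalization
        vectors.append(emb)
    # The "first vector": one row per one-second unit of the voice stream.
    return np.vstack(vectors)
```

Each row of the returned matrix then corresponds to one second of the voice stream, which is the granularity assumed in the segmentation and clustering sketches further below.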
Step 203: the voice information identification device segments the first vector at a preset time interval according to the playback time of the voice stream information to be identified, to obtain a second vector.
The preset time interval may be set in advance by the user in a specific application scenario according to factors such as the playback duration of the voice information to be identified, the number of users involved in the voice information to be identified, and the expected identification success rate. For example, with a fixed duration of one minute or five minutes as the unit, and following the actual playback time and order of the voice information to be identified, the voice information within each one-minute or five-minute interval forms one segment; after segmentation, the set of vectors corresponding to each segment of the voice information to be identified is the second vector.
It should be noted that this embodiment describes segmenting the first vector at a preset time interval to obtain the second vector; the first vector may equally be segmented at varying time intervals to obtain the second vector, and the specific segmentation scheme may be determined according to the actual application scenario.
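A minimal sketch of this segmentation step, under the assumption that the first vector is the per-second embedding matrix from the previous sketch and that the preset interval is 60 seconds; consecutive rows are simply grouped into fixed-length segments that together play the role of the second vector.

```python
import numpy as np

def segment_first_vector(first_vector, interval_seconds=60):
    """Group the per-second rows of the first vector into fixed time segments.

    Returns a list of arrays; each array holds the vectors belonging to one
    segment, and together these per-segment sets form the "second vector".
    """
    first_vector = np.asarray(first_vector)
    segments = []
    for start in range(0, len(first_vector), interval_seconds):
        segments.append(first_vector[start:start + interval_seconds])
    return segments
```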
Step 204: the voice information identification device performs principal component analysis on the first vector to obtain a fourth vector.
Specifically, the principal component analysis of the first vector may be based on a factor analysis of the first vector; for the concrete implementation of principal component analysis, reference may be made to related prior-art solutions on principal component analysis.
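The patent does not prescribe a particular PCA implementation; as one possible choice, the sketch below uses scikit-learn's PCA to project the per-second embeddings onto their leading principal components, which then serve as the fourth vector. The number of components is an assumption made for the example.

```python
from sklearn.decomposition import PCA

def principal_components(first_vector, n_components=10):
    """Run PCA on the per-second embeddings (rows of the first vector).

    Returns the fitted PCA model and the projection of every embedding onto
    the leading principal components (used here as the "fourth vector").
    """
    pca = PCA(n_components=n_components)
    fourth_vector = pca.fit_transform(first_vector)
    return pca, fourth_vector
```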
Step 205: the voice information identification device classifies the first vector according to the fourth vector to obtain a third vector.
Specifically, the first vector may be classified on the basis of the fourth vector by mapping the first vector onto each fourth vector and then classifying according to the actual mapping result to obtain the third vector.
Step 206: match the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class.
It should be noted that, for steps or concepts in this embodiment that are the same as in other embodiments, reference may be made to the description in those other embodiments; details are not repeated here.
The voice information identification method provided by this embodiment of the present invention can obtain voice stream information to be identified, analyze it and extract the corresponding first vector, then segment the first vector to obtain a second vector while classifying the first vector according to a preset classification rule to obtain a third vector, and finally match the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class. In this way, the user matching the voice stream information to be identified can be obtained from the vectors corresponding to that voice stream information, which solves the problem that voice information identification schemes in the prior art involve a large amount of computation and are difficult to operate, reduces the difficulty of voice information identification, reduces the amount of computation and, at the same time, improves the user experience.
An embodiment of the present invention provides a voice information identification method. Referring to Fig. 3, the method includes the following steps:
Step 301: the voice information identification device obtains voice stream information to be identified.
Step 302: the voice information identification device analyzes the voice stream information to be identified and extracts a first vector corresponding to the voice stream information to be identified.
Step 303: the voice information identification device segments the first vector at a preset time interval according to the playback time of the voice stream information to be identified, to obtain a second vector.
Step 304: the voice information identification device performs principal component analysis on the first vector to obtain a fourth vector.
Step 305: the voice information identification device performs variational Bayesian Gaussian mixture model clustering on the first vector according to the fourth vector to obtain a third vector.
Specifically, the voice information identification device may form a coordinate system based on the fourth vector, map each first vector into the coordinate system formed by the fourth vector, and then classify the mapped first vectors by variational Bayesian Gaussian mixture model clustering to obtain the third vector.
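One way to realize this clustering step is scikit-learn's BayesianGaussianMixture, shown in the sketch below; it clusters the PCA-projected embeddings and returns one cluster label per one-second unit, and those labels play the role of the third vector here. The upper bound on the number of mixture components is an assumption, since the number of speakers is not known in advance.

```python
from sklearn.mixture import BayesianGaussianMixture

def cluster_embeddings(fourth_vector, max_speakers=10):
    """Variational Bayesian GMM clustering of the projected embeddings.

    The Dirichlet-process prior lets the model switch off unneeded mixture
    components, so max_speakers only needs to be an upper bound on the
    number of speakers.
    """
    vbgmm = BayesianGaussianMixture(
        n_components=max_speakers,
        weight_concentration_prior_type="dirichlet_process",
        covariance_type="diag",
        max_iter=200,
    )
    labels = vbgmm.fit_predict(fourth_vector)  # one class label per second
    return vbgmm, labels
```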
Step 306: the voice information identification device matches each second vector in each segment with each third vector in each class.
Specifically, the degree of matching may be judged between all vectors in each second vector obtained by segmentation and all vectors in each third vector obtained by classification, where one second vector corresponds to one third vector.
It should be noted that, after step 306 matches each second vector in each segment with each third vector in each class, either step 307 or steps 308 to 309 are performed according to the matching result: if each second vector in each segment completely matches a third vector in a class, step 307 is performed; if a second vector in a segment does not completely match the third vectors in the classes, steps 308 to 309 are performed.
Step 307: if each second vector in each segment completely matches a third vector in a class, the voice information identification device performs voiceprint identification on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
Specifically, if all vectors in the second vector of each segment match all vectors in the corresponding third vector, each second vector in each segment completely matches a third vector in a class. In that case the segmentation of the first vector can be considered accurate, and the voice information to be identified corresponding to the second vector of one segment can be considered to be the voice information of a single user; the voice stream information to be identified corresponding to the second vector can then be matched with user information directly according to each user's own voiceprint feature information, to obtain the identity information of the corresponding user. The voiceprint feature information may include feature information that uniquely identifies the user, such as the user's timbre, tone, voice quality and volume.
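The patent leaves the concrete voiceprint identification open; purely as an illustration, the sketch below matches a segment's mean embedding against the voiceprint embeddings of enrolled users by cosine similarity. The enrollment dictionary, the use of a mean embedding and the decision threshold are assumptions made for the example.

```python
import numpy as np

def identify_segment(segment_vectors, enrolled, threshold=0.5):
    """Match one segment against enrolled users' voiceprint embeddings.

    enrolled maps a user id to that user's reference embedding; the segment
    is represented here by the mean of its per-second vectors.
    """
    probe = np.asarray(segment_vectors).mean(axis=0)
    probe = probe / (np.linalg.norm(probe) + 1e-10)
    best_user, best_score = None, -1.0
    for user_id, reference in enrolled.items():
        reference = np.asarray(reference, dtype=float)
        reference = reference / (np.linalg.norm(reference) + 1e-10)
        score = float(np.dot(probe, reference))  # cosine similarity
        if score > best_score:
            best_user, best_score = user_id, score
    return best_user if best_score >= threshold else None
```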
Step 308: if a second vector in a segment does not completely match the third vectors in the classes, the voice information identification device re-segments the second vector and, at the same time, reclassifies the third vector with the Viterbi algorithm, until the second vector in each re-segmented segment completely matches a reclassified third vector.
Specifically, if a vector in the second vector of a segment does not match the vectors in the corresponding third vector, the second vectors of the segments do not completely match the third vectors of the classes. In that case the segmentation result needs to be segmented again, and at the same time the classification result needs to be reclassified with the Viterbi algorithm; the re-segmented second vector is then compared again with the reclassified third vector to check whether they match completely. If unmatched vectors still exist, the re-segmented second vector is segmented again and the reclassified third vector is reclassified again with the Viterbi algorithm, until the second vector in each re-segmented segment completely matches a reclassified third vector. Once this is the case, the re-segmentation of the first vector can be considered accurate, and the voice information to be identified corresponding to the second vector of one segment can be considered to be the voice information of a single user; the voice stream information to be identified corresponding to the second vector can then be matched with user information according to each user's own voiceprint feature information, to obtain the identity information of the corresponding user.
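The re-segmentation and Viterbi reclassification loop is only described at a high level in the patent; the sketch below is a simplified stand-in under that reading: it checks whether every time segment is pure (all of its per-second cluster labels agree) and, where it is not, splits the segment at the points where the label changes. The actual Viterbi re-decoding of the class sequence is not reproduced here.

```python
import numpy as np

def resegment_until_pure(segments, labels):
    """Split segments at label-change points so that each piece is "pure".

    segments is a list of per-second embedding arrays; labels holds one
    cluster label per second over the whole stream. A segment "matches
    completely" when every second inside it carries the same cluster label.
    """
    labels = np.asarray(labels)
    pure_segments, offset = [], 0
    for segment in segments:
        segment_labels = labels[offset:offset + len(segment)]
        # indices inside this segment where the cluster label changes
        cuts = np.where(np.diff(segment_labels) != 0)[0] + 1
        for piece in np.split(segment, cuts):
            pure_segments.append(piece)  # every piece now has a single label
        offset += len(segment)
    return pure_segments
```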
Step 309: the voice information identification device performs voiceprint identification on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
It should be noted that, for steps or concepts in this embodiment that are the same as in other embodiments, reference may be made to the description in those other embodiments; details are not repeated here.
The voice information identification method provided by this embodiment of the present invention can obtain voice stream information to be identified, analyze it and extract the corresponding first vector, then segment the first vector to obtain a second vector while classifying the first vector according to a preset classification rule to obtain a third vector, and finally match the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class. In this way, the user matching the voice stream information to be identified can be obtained from the vectors corresponding to that voice stream information, which solves the problem that voice information identification schemes in the prior art involve a large amount of computation and are difficult to operate, reduces the difficulty of voice information identification, reduces the amount of computation and, at the same time, improves the user experience.
An embodiment of the present invention provides a voice information identification device 4, which can be applied to the voice information identification methods provided by the embodiments corresponding to Figs. 1 to 3. Referring to Fig. 4, the device may include: a first obtaining unit 41, a second obtaining unit 42, a third obtaining unit 43 and a processing unit 44, wherein:
the first obtaining unit 41 is configured to obtain voice stream information to be identified;
the first obtaining unit 41 is further configured to analyze the voice stream information to be identified and extract a first vector corresponding to the voice stream information to be identified;
the second obtaining unit 42 is configured to segment the first vector to obtain a second vector;
the third obtaining unit 43 is configured to classify the first vector according to a preset classification rule to obtain a third vector;
the processing unit 44 is configured to match the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class.
The voice information identification device provided by this embodiment of the present invention can obtain voice stream information to be identified, analyze it and extract the corresponding first vector, then segment the first vector to obtain a second vector while classifying the first vector according to a preset classification rule to obtain a third vector, and finally match the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class. In this way, the user matching the voice stream information to be identified can be obtained from the vectors corresponding to that voice stream information, which solves the problem that voice information identification schemes in the prior art involve a large amount of computation and are difficult to operate, reduces the difficulty of voice information identification, reduces the amount of computation and, at the same time, improves the user experience.
Specifically, the second obtaining unit 42 is configured to perform the following step:
segmenting the first vector at a preset time interval according to the playback time of the voice stream information to be identified, to obtain the second vector.
Further, referring to Fig. 5, the third obtaining unit 43 includes an analysis module 431 and a first processing module 432, wherein:
the analysis module 431 is configured to perform principal component analysis on the first vector to obtain a fourth vector;
the first processing module 432 is configured to classify the first vector according to the fourth vector to obtain the third vector.
Specifically, the first processing module 432 is further configured to perform the following step:
performing variational Bayesian Gaussian mixture model clustering on the first vector according to the fourth vector to obtain the third vector.
Further, referring to Fig. 6, the processing unit 44 includes a matching module 441 and a second processing module 442, wherein:
the matching module 441 is configured to match each second vector in each segment with each third vector in each class;
the second processing module 442 is configured to, if each second vector in each segment completely matches a third vector in a class, perform voiceprint identification on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
Further, referring to Fig. 7, the processing unit 44 further includes a third processing module 443 and a fourth processing module 444, wherein:
the third processing module 443 is configured to, if a second vector in a segment does not completely match the third vectors in the classes, re-segment the second vector and, at the same time, reclassify the third vector with the Viterbi algorithm, until the second vector in each re-segmented segment completely matches a reclassified third vector;
the fourth processing module 444 is configured to perform voiceprint identification on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
It should be noted that, for the interaction between the units and modules in this embodiment of the present invention, reference may be made to the interaction in the voice information identification methods provided by the embodiments corresponding to Figs. 1 to 3; details are not repeated here.
The voice information identification device provided by this embodiment of the present invention can obtain voice stream information to be identified, analyze it and extract the corresponding first vector, then segment the first vector to obtain a second vector while classifying the first vector according to a preset classification rule to obtain a third vector, and finally match the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class. In this way, the user matching the voice stream information to be identified can be obtained from the vectors corresponding to that voice stream information, which solves the problem that voice information identification schemes in the prior art involve a large amount of computation and are difficult to operate, reduces the difficulty of voice information identification, reduces the amount of computation and, at the same time, improves the user experience.
In practical applications, the first obtaining unit 41, the second obtaining unit 42, the third obtaining unit 43, the processing unit 44, the analysis module 431, the first processing module 432, the matching module 441, the second processing module 442, the third processing module 443 and the fourth processing module 444 may all be implemented by a central processing unit (CPU), a micro processor unit (MPU), a digital signal processor (DSP) or a field programmable gate array (FPGA) located in a wireless data transmission device.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system or a computer program product. Therefore, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage and optical storage) containing computer-usable program code.
The present invention is described with reference to flow charts and/or block diagrams of the method, the device (system) and the computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flow charts and/or block diagrams, and combinations of flows and/or blocks in the flow charts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flow charts and/or one or more blocks of the block diagrams.
The above are only preferred embodiments of the present invention and are not intended to limit the scope of protection of the present invention.

Claims (12)

1. A voice information identification method, the method comprising:
obtaining voice stream information to be identified;
analyzing the voice stream information to be identified, and extracting a first vector corresponding to the voice stream information to be identified;
segmenting the first vector to obtain a second vector;
classifying the first vector according to a preset classification rule to obtain a third vector;
matching the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class.
2. The method according to claim 1, characterized in that segmenting the first vector to obtain the second vector comprises:
segmenting the first vector at a preset time interval according to the playback time of the voice stream information to be identified, to obtain the second vector.
3. The method according to claim 1, characterized in that classifying the first vector according to the preset classification rule to obtain the third vector comprises:
performing principal component analysis on the first vector to obtain a fourth vector;
classifying the first vector according to the fourth vector to obtain the third vector.
4. The method according to claim 3, characterized in that classifying the first vector according to the fourth vector to obtain the third vector comprises:
performing variational Bayesian Gaussian mixture model clustering on the first vector according to the fourth vector to obtain the third vector.
5. The method according to claim 1, characterized in that matching the voice stream information to be identified with user identity information according to the relation between the second vector of each segment and the third vector of each class comprises:
matching each second vector in each segment with each third vector in each class;
if each second vector in each segment completely matches a third vector in a class, performing voiceprint identification on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
6. The method according to claim 5, characterized in that the method further comprises:
if a second vector in a segment does not completely match the third vectors in the classes, re-segmenting the second vector and, at the same time, reclassifying the third vector with the Viterbi algorithm, until the second vector in each re-segmented segment completely matches a reclassified third vector;
performing voiceprint identification on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
7. A voice information identification device, characterized in that the device comprises: a first obtaining unit, a second obtaining unit, a third obtaining unit and a processing unit, wherein:
the first obtaining unit is configured to obtain voice stream information to be identified;
the first obtaining unit is further configured to analyze the voice stream information to be identified and extract a first vector corresponding to the voice stream information to be identified;
the second obtaining unit is configured to segment the first vector to obtain a second vector;
the third obtaining unit is configured to classify the first vector according to a preset classification rule to obtain a third vector;
the processing unit is configured to match the voice stream information to be identified with a user according to the relation between the second vector of each segment and the third vector of each class.
8. The device according to claim 7, characterized in that the second obtaining unit is specifically configured to:
segment the first vector at a preset time interval according to the playback time of the voice stream information to be identified, to obtain the second vector.
9. The device according to claim 7, characterized in that the third obtaining unit comprises an analysis module and a first processing module, wherein:
the analysis module is configured to perform principal component analysis on the first vector to obtain a fourth vector;
the first processing module is configured to classify the first vector according to the fourth vector to obtain the third vector.
10. The device according to claim 9, characterized in that the first processing module is specifically configured to:
perform variational Bayesian Gaussian mixture model clustering on the first vector according to the fourth vector to obtain the third vector.
11. The device according to claim 7, characterized in that the processing unit comprises a matching module and a second processing module, wherein:
the matching module is configured to match each second vector in each segment with each third vector in each class;
the second processing module is configured to, if each second vector in each segment completely matches a third vector in a class, perform voiceprint identification on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
12. The device according to claim 11, characterized in that the processing unit further comprises a third processing module and a fourth processing module, wherein:
the third processing module is configured to, if a second vector in a segment does not completely match the third vectors in the classes, re-segment the second vector and, at the same time, reclassify the third vector with the Viterbi algorithm, until the second vector in each re-segmented segment completely matches a reclassified third vector;
the fourth processing module is configured to perform voiceprint identification on the voice stream information to be identified corresponding to the second vector in each segment, to obtain the identity information of the user corresponding to the voice stream information to be identified.
CN201610500446.XA 2016-06-29 2016-06-29 A kind of voice information identification method and equipment Active CN106205610B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610500446.XA CN106205610B (en) 2016-06-29 2016-06-29 A kind of voice information identification method and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610500446.XA CN106205610B (en) 2016-06-29 2016-06-29 A kind of voice information identification method and equipment

Publications (2)

Publication Number Publication Date
CN106205610A true CN106205610A (en) 2016-12-07
CN106205610B CN106205610B (en) 2019-11-26

Family

ID=57463742

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610500446.XA Active CN106205610B (en) 2016-06-29 2016-06-29 A kind of voice information identification method and equipment

Country Status (1)

Country Link
CN (1) CN106205610B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108154588A (en) * 2017-12-29 2018-06-12 深圳市艾特智能科技有限公司 Unlocking method, system, readable storage medium storing program for executing and smart machine
CN108597540A (en) * 2018-04-11 2018-09-28 南京信息工程大学 A kind of speech-emotion recognition method based on variation mode decomposition and extreme learning machine
CN109559744A (en) * 2018-12-12 2019-04-02 泰康保险集团股份有限公司 Processing method, device and the readable storage medium storing program for executing of voice data
CN110347248A (en) * 2019-06-24 2019-10-18 歌尔科技有限公司 Interaction processing method, device, equipment and audio frequency apparatus

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080235016A1 (en) * 2007-01-23 2008-09-25 Infoture, Inc. System and method for detection and analysis of speech
US7680657B2 (en) * 2006-08-15 2010-03-16 Microsoft Corporation Auto segmentation based partitioning and clustering approach to robust endpointing
CN102543063A (en) * 2011-12-07 2012-07-04 华南理工大学 Method for estimating speech speed of multiple speakers based on segmentation and clustering of speakers
CN102760434A (en) * 2012-07-09 2012-10-31 华为终端有限公司 Method for updating voiceprint feature model and terminal
CN103229233A (en) * 2010-12-10 2013-07-31 松下电器产业株式会社 Modeling device and method for speaker recognition, and speaker recognition system
CN103811020A (en) * 2014-03-05 2014-05-21 东北大学 Smart voice processing method
CN103871424A (en) * 2012-12-13 2014-06-18 上海八方视界网络科技有限公司 Online speaking people cluster analysis method based on bayesian information criterion
CN105161093A (en) * 2015-10-14 2015-12-16 科大讯飞股份有限公司 Method and system for determining the number of speakers

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7680657B2 (en) * 2006-08-15 2010-03-16 Microsoft Corporation Auto segmentation based partitioning and clustering approach to robust endpointing
US20080235016A1 (en) * 2007-01-23 2008-09-25 Infoture, Inc. System and method for detection and analysis of speech
CN103229233A (en) * 2010-12-10 2013-07-31 松下电器产业株式会社 Modeling device and method for speaker recognition, and speaker recognition system
CN102543063A (en) * 2011-12-07 2012-07-04 华南理工大学 Method for estimating speech speed of multiple speakers based on segmentation and clustering of speakers
CN102760434A (en) * 2012-07-09 2012-10-31 华为终端有限公司 Method for updating voiceprint feature model and terminal
CN103871424A (en) * 2012-12-13 2014-06-18 上海八方视界网络科技有限公司 Online speaking people cluster analysis method based on bayesian information criterion
CN103811020A (en) * 2014-03-05 2014-05-21 东北大学 Smart voice processing method
CN105161093A (en) * 2015-10-14 2015-12-16 科大讯飞股份有限公司 Method and system for determining the number of speakers

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108154588A (en) * 2017-12-29 2018-06-12 深圳市艾特智能科技有限公司 Unlocking method, system, readable storage medium storing program for executing and smart machine
CN108597540A (en) * 2018-04-11 2018-09-28 南京信息工程大学 A kind of speech-emotion recognition method based on variation mode decomposition and extreme learning machine
CN109559744A (en) * 2018-12-12 2019-04-02 泰康保险集团股份有限公司 Processing method, device and the readable storage medium storing program for executing of voice data
CN109559744B (en) * 2018-12-12 2022-07-08 泰康保险集团股份有限公司 Voice data processing method and device and readable storage medium
CN110347248A (en) * 2019-06-24 2019-10-18 歌尔科技有限公司 Interaction processing method, device, equipment and audio frequency apparatus

Also Published As

Publication number Publication date
CN106205610B (en) 2019-11-26

Similar Documents

Publication Publication Date Title
CN108922518B (en) Voice data amplification method and system
CN108962255B (en) Emotion recognition method, emotion recognition device, server and storage medium for voice conversation
WO2021068487A1 (en) Face recognition model construction method, apparatus, computer device, and storage medium
CN106104674B (en) Mixing voice identification
Wu et al. Two-level hierarchical alignment for semi-coupled HMM-based audiovisual emotion recognition with temporal course
CN106294774A (en) User individual data processing method based on dialogue service and device
CN106486126B (en) Speech recognition error correction method and device
CN106205610A (en) A kind of voice information identification method and equipment
TW201117110A (en) Behavior recognition system and recognition method by combining image and speech, and the computer
CN110148399A (en) A kind of control method of smart machine, device, equipment and medium
CN104795065A (en) Method for increasing speech recognition rate and electronic device
CN108172219A (en) The method and apparatus for identifying voice
CN109903392A (en) Augmented reality method and apparatus
CN112509583A (en) Auxiliary supervision method and system based on scheduling operation order system
CN109376363A (en) A kind of real-time voice interpretation method and device based on earphone
CN110211609A (en) A method of promoting speech recognition accuracy
CN109324515A (en) A kind of method and controlling terminal controlling intelligent electric appliance
WO2022237633A1 (en) Image processing method, apparatus, and device, and storage medium
CN111553899A (en) Audio and video based Parkinson non-contact intelligent detection method and system
CN111462762B (en) Speaker vector regularization method and device, electronic equipment and storage medium
CN114694651A (en) Intelligent terminal control method and device, electronic equipment and storage medium
CN108446403A (en) Language exercise method, apparatus, intelligent vehicle mounted terminal and storage medium
CN112489678A (en) Scene recognition method and device based on channel characteristics
CN112435672A (en) Voiceprint recognition method, device, equipment and storage medium
CN106971731B (en) Correction method for voiceprint recognition

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant