CN106971730A - A voiceprint recognition method based on channel compensation - Google Patents
A voiceprint recognition method based on channel compensation
- Publication number
- CN106971730A CN106971730A CN201610025193.5A CN201610025193A CN106971730A CN 106971730 A CN106971730 A CN 106971730A CN 201610025193 A CN201610025193 A CN 201610025193A CN 106971730 A CN106971730 A CN 106971730A
- Authority
- CN
- China
- Prior art keywords
- frequency range
- sequence number
- data group
- sequence
- identification feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 48
- 238000012360 testing method Methods 0.000 claims abstract description 31
- 238000012549 training Methods 0.000 claims abstract description 30
- 230000015572 biosynthetic process Effects 0.000 claims abstract description 8
- 238000013144 data compression Methods 0.000 claims description 18
- 230000008569 process Effects 0.000 claims description 9
- 230000002123 temporal effect Effects 0.000 claims description 7
- 238000012790 confirmation Methods 0.000 claims description 6
- 230000001755 vocal effect Effects 0.000 description 4
- 238000004364 calculation method Methods 0.000 description 2
- 230000008859 change Effects 0.000 description 2
- 238000006243 chemical reaction Methods 0.000 description 2
- 238000000605 extraction Methods 0.000 description 2
- 238000012545 processing Methods 0.000 description 2
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 230000001186 cumulative effect Effects 0.000 description 1
- 230000004069 differentiation Effects 0.000 description 1
- 230000006870 function Effects 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/08—Use of distortion metrics or a particular distance between probe pattern and reference templates
Landscapes
- Engineering & Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Business, Economics & Management (AREA)
- Game Theory and Decision Science (AREA)
- Image Processing (AREA)
Abstract
The invention discloses a voiceprint recognition method based on channel compensation, belonging to the technical field of biometric identification. The method comprises: receiving an externally input sound source; converting the sound source into standard speech according to a preset compensation model using a channel compensation method; fitting the speech to a first frequency range and a second frequency range respectively; and then performing the following operations for each of the two frequency ranges: dividing the speech into a plurality of identification segments; applying a feature transform to each identification segment to obtain identification feature points, which together form an identification feature space; dividing the identification feature space into a plurality of subspaces; applying the feature transform to the training sentences to obtain time-sequence feature points, assigning them to the subspaces, forming a first sequence from the subspace sequence numbers, and thereby forming a training identification feature; similarly, obtaining a test identification feature from the test sentences; and finally comparing the test identification feature with the training identification feature and processing the comparison result to obtain the voiceprint recognition result.
Description
Technical field
The present invention relates to the technical field of biometric identification, and more particularly to a voiceprint recognition method based on channel compensation.
Background art
Voiceprint recognition, like fingerprint, iris and face recognition, is a form of biometric identification and is regarded as one of the most natural biometric identity-authentication modes. Voiceprint recognition makes it easy to verify a speaker's identity, and this verification mode offers strong privacy because a voiceprint usually cannot be fraudulently copied or stolen. Voiceprint recognition therefore has outstanding application advantages in many fields, especially for smart devices.
The basic process of voiceprint recognition consists of voice acquisition, feature extraction and classification modelling. A common feature-extraction method exploits the short-term stationarity of speech and converts the speech into an identification feature set with the Mel cepstrum transform; a classification model for the speaker is then obtained by training on the speaker's speech, and the voiceprint recognition result is produced from the resulting identification models. This process, however, has the following problems: (1) such a voiceprint recognition model needs many training samples before it can be applied; (2) the computation required for voiceprint recognition based on such an identification model is complex; (3) the amount of model data produced by such an identification model is large; and (4) because the channel carrying the sound source is variable (for example, when the sound source is a recording, recording noise can distort the speech), the accuracy of the voiceprint recognition can be greatly reduced. In summary, for intelligent systems with limited resources, these problems restrict the application of prior-art voiceprint recognition algorithms.
Summary of the invention
In view of the above problems in the prior art, a technical solution of a voiceprint recognition method based on channel compensation is now provided, specifically including:
A voiceprint recognition method based on channel compensation, wherein a first frequency range and a second frequency range are preset, the first frequency range being higher than the second frequency range, comprising the steps of:
Step S1, receiving an externally input sound source;
Step S2, converting the sound source into standard speech according to a preset compensation model using a channel compensation method;
Step S3, fitting the speech to the first frequency range and the second frequency range respectively;
Step S4, dividing the speech under the first frequency range or the second frequency range into identification segments of a specific length;
Step S5, applying a feature transform to each identification segment to obtain a plurality of corresponding identification features, and using all identification features associated with all identification segments to form the identification feature space corresponding to the first frequency range, or the identification feature space corresponding to the second frequency range;
Step S6, dividing the identification feature space into a plurality of subspaces, describing each divided subspace with description information, and assigning a corresponding sequence number to each subspace;
Step S7, applying the feature transform to every training sentence associated with the training model, in the first frequency range or in the second frequency range respectively, to obtain a time-sequence feature point set containing corresponding time-sequence feature points; assigning each time-sequence feature point to one of the subspaces under the same frequency range; forming, from the sequence numbers of the subspaces corresponding to the time-sequence feature points, a first sequence associated with the first frequency range or the second frequency range; and thereby forming the corresponding training identification feature;
Step S8, applying the feature transform to every test sentence associated with the test model, in the first frequency range or in the second frequency range respectively, to obtain a time-sequence feature point set; assigning each time-sequence feature point to one of the subspaces; forming, from the sequence numbers of the subspaces corresponding to the time-sequence feature points, a second sequence associated with the first frequency range or the second frequency range; and thereby forming the corresponding test identification feature;
Step S9, comparing whether the training identification feature and the test identification feature associated with the first frequency range are similar, and processing the comparison result to obtain the confirmation result of the voiceprint recognition based on channel compensation; or
comparing whether the training identification feature and the test identification feature associated with the second frequency range are similar, and processing the comparison result to obtain the confirmation result of the voiceprint recognition based on channel compensation.
Preferably, in the above voiceprint recognition method based on channel compensation, in step S7 each time-sequence feature point is assigned to a subspace according to the nearest-neighbor rule.
Preferably, in step S7 the subspaces to which the time-sequence feature points are assigned form a spatial sequence according to their sequence numbers, and this spatial sequence is taken as the first sequence to form the training identification feature.
Preferably, in step S8 the subspaces to which the time-sequence feature points are assigned form a spatial sequence according to their sequence numbers, and this spatial sequence is taken as the second sequence to form the test identification feature.
Preferably, in step S7 the spatial sequence comprises data groups associated with the subspaces, each data group corresponding to one sequence number;
after the spatial sequence has been formed, the method further comprises a first data compression applied to the spatial sequence in the first frequency range or the second frequency range respectively, specifically:
Step S71, recording the sequence number of each data group and the repetition count associated with each sequence number;
Step S72, judging whether any sequence number has a repetition count of 1, and turning to step S73 when a data group with a repetition count of 1 exists;
Step S73, deleting the data group whose repetition count is 1;
Step S74, judging whether the sequence number of the data group preceding the deleted data group is identical to the sequence number of the data group following it:
if identical, merging the preceding and the following data group;
if different, retaining both the preceding and the following data group;
the first sequence being formed after the first data compression has been applied to all data groups in the spatial sequence.
Preferably, in step S8 the spatial sequence comprises data groups associated with the subspaces, each data group corresponding to one sequence number;
after the spatial sequence has been formed, the method further comprises a second data compression applied to the spatial sequence in the first frequency range or the second frequency range respectively, specifically:
Step S81, recording the sequence number of each data group and the repetition count associated with each sequence number;
Step S82, judging whether any sequence number has a repetition count of 1, and turning to step S83 when a data group with a repetition count of 1 exists;
Step S83, deleting the data group whose repetition count is 1;
Step S84, judging whether the sequence number of the data group preceding the deleted data group is identical to the sequence number of the data group following it:
if identical, merging the preceding and the following data group;
if different, retaining both the preceding and the following data group;
the second sequence being formed after the second data compression has been applied to all data groups in the spatial sequence.
Preferably, the feature transform is the Mel cepstrum transform.
Preferably, during the Mel cepstrum transform each sentence is divided into 20 ms frames with a 10 ms frame shift to obtain the sentence frames associated with that sentence; silence is then removed frame by frame, 12 coefficients are retained for each frame after the cepstrum transform, and these 12 coefficients constitute the identification feature.
Preferably, in step S6 the identification feature space is divided into several subspaces with the K-means algorithm, and for each subspace obtained the K-means centroid is recorded as the description information of that subspace.
The above technical solution has the following beneficial effects. A voiceprint recognition method based on channel compensation is provided in which channel compensation is applied to the sound source before the speech is recognised, converting the sound source into standard speech. This guarantees the accuracy of the voiceprint recognition, keeps the computation small, saves storage and computing resources, overcomes the problems of modelling methods based on probability statistics, and is suitable for intelligent systems with limited resources. In addition, a first frequency range representing child speakers and a second frequency range representing adult speakers are preset and compared separately, further improving the accuracy of the voiceprint recognition based on channel compensation.
Brief description of the drawings
Fig. 1 is a general flow chart of a voiceprint recognition method based on channel compensation in a preferred embodiment of the present invention;
Fig. 2 is a schematic flow chart of the first data compression in a preferred embodiment of the present invention;
Fig. 3 is a schematic flow chart of the second data compression in a preferred embodiment of the present invention.
Embodiment
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention without creative work fall within the protection scope of the invention.
It should be noted that, provided there is no conflict, the embodiments of the present invention and the features in the embodiments may be combined with each other.
The invention is further described below with reference to the accompanying drawings and specific embodiments, which are not to be taken as limiting the invention.
In a preferred embodiment of the present invention, in view of the above problems in the prior art, a voiceprint recognition method based on channel compensation is provided. The method can be applied to smart devices with a voice control function, for example to an intelligent robot in a personal space.
In the above voiceprint recognition method based on channel compensation, a first frequency range and a second frequency range are preset, the first frequency range being higher than the second frequency range. Specifically, the voice frequency may differ between users; a rough division of frequency yields a lower frequency range corresponding to adult speakers and a higher frequency range corresponding to child speakers.
Further, voiceprint recognition based on channel compensation may differ between adult speakers and child speakers; in particular, the extraction of the voiceprint features and the construction of the corresponding voiceprint models may differ. In the technical solution of the invention, two speech-receiving frequency ranges are therefore set, and the speech of adults and the speech of children are recognised separately according to these two ranges, further improving recognition accuracy. In other words, the first frequency range can be used to represent the voice band of child speakers, and the second frequency range the voice band of adult speakers. In a preferred embodiment of the invention, the two frequency ranges can be adjusted as experimental data accumulate, so that they accurately represent the voice bands of adult speakers and child speakers respectively. The fitting of the speech to the two ranges is illustrated by the sketch below.
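The patent gives no numeric limits for the two frequency ranges and does not say how the fitting in step S3 is carried out. The following is only a minimal sketch, assuming the decision is made from a rough per-frame fundamental-frequency estimate; the band limits (`ADULT_BAND`, `CHILD_BAND`), the frame length and the helper names are hypothetical placeholders, not values from the patent.

```python
import numpy as np

# Hypothetical band limits in Hz; the patent only states that the first
# (child) range lies above the second (adult) range.
ADULT_BAND = (85.0, 255.0)    # assumed "second frequency range"
CHILD_BAND = (250.0, 400.0)   # assumed "first frequency range"

def estimate_f0(frame, sr, f_min=70.0, f_max=450.0):
    """Rough fundamental-frequency estimate of one frame via autocorrelation."""
    frame = frame - frame.mean()
    corr = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sr / f_max), int(sr / f_min)
    return sr / (lo + np.argmax(corr[lo:hi]))

def fit_band(signal, sr, frame_len=0.02):
    """Decide which preset range the speech fits better (majority vote over frames)."""
    n = int(frame_len * sr)
    f0s = [estimate_f0(signal[i:i + n], sr) for i in range(0, len(signal) - n, n)]
    child = sum(CHILD_BAND[0] <= f <= CHILD_BAND[1] for f in f0s)
    adult = sum(ADULT_BAND[0] <= f <= ADULT_BAND[1] for f in f0s)
    return "first (child)" if child > adult else "second (adult)"
```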
In a preferred embodiment of the invention, as shown in Fig. 1, the above voiceprint recognition method based on channel compensation specifically comprises the following steps:
Step S1, receiving an externally input sound source;
Step S2, converting the sound source into standard speech according to a preset compensation model using a channel compensation method;
Step S3, fitting the speech to the first frequency range and the second frequency range respectively;
Step S4, dividing the speech under the first frequency range or the second frequency range into identification segments of a specific length;
Step S5, applying a feature transform to each identification segment to obtain a plurality of corresponding identification features, and using all identification features associated with all identification segments to form the identification feature space corresponding to the first frequency range, or the identification feature space corresponding to the second frequency range;
Step S6, dividing the identification feature space into a plurality of subspaces, describing each divided subspace with description information, and assigning a corresponding sequence number to each subspace;
Step S7, applying the feature transform to every training sentence associated with the training model, in the first frequency range or the second frequency range respectively, to obtain a time-sequence feature point set containing corresponding time-sequence feature points; assigning each time-sequence feature point to one of the subspaces under the same frequency range; forming, from the sequence numbers of the corresponding subspaces, a first sequence associated with the first or the second frequency range; and thereby forming the corresponding training identification feature;
Step S8, applying the feature transform to every test sentence associated with the test model, in the first frequency range or the second frequency range respectively, to obtain a time-sequence feature point set; assigning each time-sequence feature point to one of the subspaces; forming, from the sequence numbers of the corresponding subspaces, a second sequence associated with the first or the second frequency range; and thereby forming the corresponding test identification feature;
Step S9, comparing whether the training identification feature and the test identification feature associated with the first frequency range are similar, and processing the comparison result to obtain the confirmation result of the voiceprint recognition based on channel compensation; or comparing whether the training identification feature and the test identification feature associated with the second frequency range are similar, and processing the comparison result to obtain the confirmation result of the voiceprint recognition based on channel compensation.
In this embodiment, the voiceprint recognition method applies channel compensation to the sound source before the speech is recognised, converting the sound source into standard speech. This guarantees the accuracy of the voiceprint recognition and avoids the recognition result being affected by, for example, the sound source being a recording or excessive external noise.
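The compensation model itself is not disclosed in the patent. Purely as an illustration of the idea of mapping an arbitrary channel onto a "standard" one, the sketch below equalises the long-term average log-magnitude spectrum of the input towards a reference spectrum; the STFT parameters and the precomputed `ref_log_spectrum` (assumed to be taken from standard-channel speech with the same settings) are assumptions, not the patented model.

```python
import numpy as np
from scipy.signal import stft, istft

def compensate_channel(x, sr, ref_log_spectrum, nperseg=512):
    """Illustrative channel compensation: push the input's long-term average
    log-magnitude spectrum towards a reference ("standard") spectrum of
    length nperseg // 2 + 1. Not the patent's undisclosed compensation model."""
    _, _, Z = stft(x, fs=sr, nperseg=nperseg)
    mag, phase = np.abs(Z), np.angle(Z)
    avg_log = np.log(mag.mean(axis=1) + 1e-10)           # crude channel estimate
    gain = np.exp(ref_log_spectrum - avg_log)[:, None]   # per-bin correction
    _, y = istft(mag * gain * np.exp(1j * phase), fs=sr, nperseg=nperseg)
    return y
```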
In a preferred embodiment of the invention, on the basis of the above presetting, in steps S4-S5 the speech under the first frequency range or the second frequency range, drawn from different backgrounds and different voices, is first obtained and divided into identification segments of a specific length. Specifically, each sentence corresponding to the different backgrounds and voices can be divided into sentence frames of 20 ms with a 10 ms frame shift; silence is then removed frame by frame, the cepstrum transform is applied to each speech frame, and 12 coefficients are retained per frame; these 12 coefficients constitute the identification feature. The identification features of all speech segments constitute the identification feature set, i.e. the corresponding identification feature space.
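As a rough sketch of this framing and cepstral step (20 ms frames, 10 ms shift, silence removal, 12 coefficients per frame), the code below uses `librosa`; the energy threshold used as the silence gate is an assumption, since the patent does not say how silence is detected.

```python
import numpy as np
import librosa

def identification_features(y, sr, floor_db=-40.0):
    """12 Mel-cepstral coefficients per 20 ms frame with a 10 ms shift,
    keeping only frames whose energy lies within floor_db of the loudest
    frame (the silence-removal criterion is assumed, not from the patent)."""
    n_fft, hop = int(0.02 * sr), int(0.01 * sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=12,
                                n_fft=n_fft, hop_length=hop)        # (12, T)
    rms = librosa.feature.rms(y=y, frame_length=n_fft, hop_length=hop)[0]
    keep = 20 * np.log10(rms + 1e-12) > 20 * np.log10(rms.max() + 1e-12) + floor_db
    n = min(mfcc.shape[1], len(keep))
    return mfcc[:, :n][:, keep[:n]].T                               # (kept frames, 12)
```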
In a preferred embodiment of the invention, in step S6 the identification feature space is divided into a plurality of subspaces with the K-means algorithm; for each subspace obtained, the K-means centroid is recorded as the data description of that subspace, each subspace is numbered, and the description information of each subspace is recorded together with its sequence number. This step is performed separately for the identification feature space under the first frequency range and under the second frequency range.
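A minimal sketch of this subspace division using scikit-learn's K-means; the number of subspaces is not specified in the patent, so `n_subspaces=64` is an assumed value.

```python
from sklearn.cluster import KMeans

def build_subspaces(feature_space, n_subspaces=64):
    """Divide the identification feature space (array of shape (n_points, 12))
    into numbered subspaces; the centroid km.cluster_centers_[i] serves as the
    description information of the subspace with sequence number i."""
    km = KMeans(n_clusters=n_subspaces, n_init=10, random_state=0)
    km.fit(feature_space)
    return km
```

In this reading, the routine would be run once per frequency range, giving one set of numbered subspaces for the first range and one for the second.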
In a preferred embodiment of the invention, the operation of step S7 is carried out on the subspaces under the first frequency range and under the second frequency range respectively: the feature transform is applied to every training sentence associated with the training model to obtain a time-sequence feature point set containing the corresponding time-sequence feature points; each time-sequence feature point is assigned to one of the subspaces under the same frequency range; a first sequence associated with the first or the second frequency range is formed from the sequence numbers of the corresponding subspaces; and the corresponding training identification feature is formed from it.
Specifically, in a preferred embodiment of the invention, a training sentence is a sentence that, after repeated training, is stored inside the system as part of the training model and serves as the reference when the system performs a comparison.
Specifically, in a preferred embodiment of the invention, in step S7 each time-sequence feature point is assigned, according to the nearest-neighbor rule, to one of the subspaces under the same frequency range (the first or the second), and the sequence number of the subspace corresponding to each time-sequence feature point is recorded, finally forming a first sequence composed of the sequence numbers of different subspaces, for example (2, 2, 4, 8, 8, 8, 5, 5, 5, 5, 5); the corresponding training identification feature is then formed from this first sequence.
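Given the K-means model from the earlier sketch, the nearest-neighbor assignment and the resulting first sequence could look like the following (the `km` object and the function name are assumptions carried over from that sketch):

```python
import numpy as np

def subspace_sequence(ts_feature_points, km):
    """Assign each time-sequence feature point to its nearest subspace and
    return the sequence of subspace sequence numbers, e.g.
    [2, 2, 4, 8, 8, 8, 5, 5, 5, 5, 5]."""
    return km.predict(np.asarray(ts_feature_points)).tolist()
```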
In a preferred embodiment of the invention, step S8 proceeds similarly for the subspaces under the first frequency range or the second frequency range: the feature transform is applied to the test sentences associated with the test model to obtain a time-sequence feature point set; each time-sequence feature point is assigned to one of the subspaces; a second sequence associated with the first or the second frequency range is formed from the sequence numbers of the corresponding subspaces; and the corresponding test identification feature is formed from it.
In a preferred embodiment of the invention, a test sentence is a sentence associated with the test model, i.e. the sentence to be compared.
Specifically, in a preferred embodiment of the invention, in step S8 each time-sequence feature point of the test sentence is likewise assigned, according to the nearest-neighbor rule, to one of the subspaces under the same frequency range (the first or the second), and the sequence number of the subspace corresponding to each time-sequence feature point is recorded, finally forming a second sequence composed of the sequence numbers of different subspaces, for example (2, 3, 3, 5, 5, 8, 6, 6, 6, 4, 4); the corresponding test identification feature is then formed from this second sequence. In a preferred embodiment of the invention there is no dependency between step S7 and step S8 (i.e. the execution of step S8 is not premised on step S7 having finished), so steps S7 and S8 can be carried out simultaneously; Fig. 1 nevertheless shows an embodiment in which steps S7 and S8 are carried out in order.
In a preferred embodiment of the invention, in step S9 the training identification feature and the test identification feature formed above are compared, and the final result of the voiceprint recognition based on channel compensation is obtained by processing the comparison result.
Specifically, in step S9 the comparison is likewise carried out separately for the first frequency range and the second frequency range: the test identification feature under the first frequency range is compared with the training identification feature under the first frequency range, and the result of the voiceprint recognition based on channel compensation is obtained from the comparison result; likewise, the test identification feature under the second frequency range is compared with the training identification feature under the second frequency range, and the result of the voiceprint recognition based on channel compensation is obtained from the comparison result.
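The patent does not name the similarity measure used in step S9. As one plausible choice for comparing two sequences of subspace numbers, the sketch below uses a normalised edit (Levenshtein) distance with an assumed threshold; both the metric and the threshold are illustrative assumptions, not the patented comparison rule.

```python
def sequences_similar(train_seq, test_seq, threshold=0.35):
    """Return True if the normalised edit distance between the training and
    test identification-feature sequences is below the (assumed) threshold."""
    m, n = len(train_seq), len(test_seq)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if train_seq[i - 1] == test_seq[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[m][n] / max(m, n, 1) <= threshold
```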
Further, in a preferred embodiment of the invention, in step S7 the spatial sequence comprises data groups associated with the subspaces, each data group corresponding to one sequence number.
After the spatial sequence has been formed, a first data compression is applied to the spatial sequence in the first frequency range or the second frequency range respectively, as shown in Fig. 2:
Step S71, recording the sequence number of each data group and the repetition count associated with each sequence number;
Step S72, judging whether any sequence number has a repetition count of 1, and turning to step S73 when a data group with a repetition count of 1 exists;
Step S73, deleting the data group whose repetition count is 1;
Step S74, judging whether the sequence number of the data group preceding the deleted data group is identical to the sequence number of the data group following it:
if identical, merging the preceding and the following data group;
if different, retaining both the preceding and the following data group;
the first sequence is formed after the first data compression has been applied to all data groups in the spatial sequence.
Specifically, in a preferred embodiment of the invention, during the first data compression the sequence number of each subspace and the count of identical consecutive sequence numbers are recorded, and each sequence number together with its count is arranged as one data group; when the count of a sequence number is 1, that data group is removed. In the above embodiment of the invention, the data group with sequence number 4 has only one entry, so it is deleted during the first data compression.
If, after a data group has been removed, the sequence number of the preceding data group is identical to the sequence number of the following data group, the two are merged: the merged data group takes the sequence number of the data group preceding the deleted one, and its count is the sum of the counts of the preceding and the following data group. If, after the deletion, the sequence numbers of the preceding and following data groups differ, both data groups are retained. For example, in the preferred embodiment of the invention, after the data group with sequence number 4 is removed, the preceding data group has sequence number 2 and the following data group has sequence number 8; since 2 and 8 differ, both original data groups are retained.
In a preferred embodiment of the invention, the first sequence obtained after the first data compression is the training identification feature described above.
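The first data compression (and the identical second data compression in steps S81-S84) behaves like a run-length encoding that drops single-occurrence runs. A minimal sketch, with the worked example from the description:

```python
from itertools import groupby

def compress_sequence(seq):
    """Run-length encode the subspace-number sequence, delete every data group
    whose repetition count is 1, and merge neighbouring groups that end up
    adjacent with the same sequence number."""
    runs = [(num, len(list(g))) for num, g in groupby(seq)]   # (sequence number, count)
    runs = [(num, c) for num, c in runs if c > 1]             # drop count-1 data groups
    merged = []
    for num, c in runs:
        if merged and merged[-1][0] == num:
            merged[-1] = (num, merged[-1][1] + c)             # merge equal neighbours
        else:
            merged.append((num, c))
    return merged

# The sequence (2, 2, 4, 8, 8, 8, 5, 5, 5, 5, 5) from the description compresses
# to [(2, 2), (8, 3), (5, 5)]: the single "4" is deleted, and since its
# neighbours 2 and 8 differ, both neighbouring data groups are retained.
```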
Correspondingly, in a preferred embodiment of the invention, in step S8 the spatial sequence comprises data groups associated with the subspaces, each data group corresponding to one sequence number.
After the spatial sequence has been formed, a second data compression is applied to the spatial sequence in the first frequency range or the second frequency range respectively, as shown in Fig. 3:
Step S81, recording the sequence number of each data group and the repetition count associated with each sequence number;
Step S82, judging whether any sequence number has a repetition count of 1, and turning to step S83 when a data group with a repetition count of 1 exists;
Step S83, deleting the data group whose repetition count is 1;
Step S84, judging whether the sequence number of the data group preceding the deleted data group is identical to the sequence number of the data group following it:
if identical, merging the preceding and the following data group;
if different, retaining both the preceding and the following data group;
the second sequence is formed after the second data compression has been applied to all data groups in the spatial sequence.
Specifically, as in step S7, in step S8 the sequence number of each subspace and the count of identical consecutive sequence numbers are recorded, each sequence number and its count are arranged as one data group, and any data group whose count is 1 is removed.
If, after a data group has been removed, the sequence number of the preceding data group is identical to the sequence number of the following data group, the two are merged: the merged data group takes the sequence number of the data group preceding the deleted one, and its count is the sum of the counts of the preceding and the following data group. If the sequence numbers differ, both data groups are retained. For example, in the preferred embodiment of the invention, after the data group with sequence number 4 is removed, the preceding data group has sequence number 2 and the following data group has sequence number 8; since 2 and 8 differ, both original data groups are retained.
Similarly, in a preferred embodiment of the invention, the second sequence obtained after the second data compression is the test identification feature.
In step S9, the training identification feature and the test identification feature under the same frequency range (the first or the second) are finally compared, and the final result of the voiceprint recognition based on channel compensation is obtained by processing the comparison result.
The execution of the above steps keeps the computation of the voiceprint recognition based on channel compensation small, gives a better recognition rate, and keeps the amount of data to be processed relatively small.
The foregoing are only preferred embodiments of the present invention and do not limit its embodiments or protection scope. Those skilled in the art should appreciate that all equivalent substitutions and obvious variations made on the basis of the description and drawings of the present invention fall within the protection scope of the invention.
Claims (9)
1. A voiceprint recognition method based on channel compensation, characterised in that a first frequency range and a second frequency range are preset, the first frequency range being higher than the second frequency range, the method comprising the steps of:
Step S1, receiving an externally input sound source;
Step S2, converting the sound source into standard speech according to a preset compensation model using a channel compensation method;
Step S3, fitting the speech to the first frequency range and the second frequency range respectively;
Step S4, dividing the speech under the first frequency range or the second frequency range into identification segments of a specific length;
Step S5, applying a feature transform to each identification segment to obtain a plurality of corresponding identification features, and using all identification features associated with all identification segments to form the identification feature space corresponding to the first frequency range, or the identification feature space corresponding to the second frequency range;
Step S6, dividing the identification feature space into a plurality of subspaces, describing each divided subspace with description information, and assigning a corresponding sequence number to each subspace;
Step S7, applying the feature transform to every training sentence associated with the training model, in the first frequency range or in the second frequency range respectively, to obtain a time-sequence feature point set containing corresponding time-sequence feature points, assigning each time-sequence feature point to one of the subspaces under the same frequency range, forming a first sequence associated with the first frequency range or the second frequency range from the sequence numbers of the subspaces corresponding to the time-sequence feature points, and thereby forming the corresponding training identification feature;
Step S8, applying the feature transform to every test sentence associated with the test model, in the first frequency range or in the second frequency range respectively, to obtain a time-sequence feature point set, assigning each time-sequence feature point to one of the subspaces, forming a second sequence associated with the first frequency range or the second frequency range from the sequence numbers of the subspaces corresponding to the time-sequence feature points, and thereby forming the corresponding test identification feature;
Step S9, comparing whether the training identification feature and the test identification feature associated with the first frequency range are similar, and processing the comparison result to obtain the confirmation result of the voiceprint recognition based on channel compensation, or
comparing whether the training identification feature and the test identification feature associated with the second frequency range are similar, and processing the comparison result to obtain the confirmation result of the voiceprint recognition based on channel compensation.
2. The voiceprint recognition method based on channel compensation according to claim 1, characterised in that in step S7 each time-sequence feature point is assigned to a subspace according to the nearest-neighbor rule.
3. The voiceprint recognition method based on channel compensation according to claim 1, characterised in that in step S7 the subspaces to which the time-sequence feature points are assigned form a spatial sequence according to their sequence numbers, and the spatial sequence is taken as the first sequence to form the training identification feature.
4. The voiceprint recognition method based on channel compensation according to claim 1, characterised in that in step S8 the subspaces to which the time-sequence feature points are assigned form a spatial sequence according to their sequence numbers, and the spatial sequence is taken as the second sequence to form the test identification feature.
5. The voiceprint recognition method based on channel compensation according to claim 3, characterised in that in step S7 the spatial sequence comprises data groups associated with the subspaces, each data group corresponding to one sequence number;
after the spatial sequence has been formed, the method further comprises a first data compression applied to the spatial sequence in the first frequency range or the second frequency range respectively, specifically:
Step S71, recording the sequence number of each data group and the repetition count associated with each sequence number;
Step S72, judging whether any sequence number has a repetition count of 1, and turning to step S73 when a data group with a repetition count of 1 exists;
Step S73, deleting the data group whose repetition count is 1;
Step S74, judging whether the sequence number of the data group preceding the deleted data group is identical to the sequence number of the data group following it:
if identical, merging the preceding and the following data group;
if different, retaining both the preceding and the following data group;
the first sequence being formed after the first data compression has been applied to all data groups in the spatial sequence.
6. The voiceprint recognition method based on channel compensation according to claim 4, characterised in that in step S8 the spatial sequence comprises data groups associated with the subspaces, each data group corresponding to one sequence number;
after the spatial sequence has been formed, the method further comprises a second data compression applied to the spatial sequence in the first frequency range or the second frequency range respectively, specifically:
Step S81, recording the sequence number of each data group and the repetition count associated with each sequence number;
Step S82, judging whether any sequence number has a repetition count of 1, and turning to step S83 when a data group with a repetition count of 1 exists;
Step S83, deleting the data group whose repetition count is 1;
Step S84, judging whether the sequence number of the data group preceding the deleted data group is identical to the sequence number of the data group following it:
if identical, merging the preceding and the following data group;
if different, retaining both the preceding and the following data group;
the second sequence being formed after the second data compression has been applied to all data groups in the spatial sequence.
7. The voiceprint recognition method based on channel compensation according to claim 1, characterised in that the feature transform is the Mel cepstrum transform.
8. The voiceprint recognition method based on channel compensation according to claim 7, characterised in that during the Mel cepstrum transform each sentence is divided into 20 ms frames with a 10 ms frame shift to obtain the sentence frames associated with that sentence; silence is then removed frame by frame, 12 coefficients are retained for each frame after the cepstrum transform, and these 12 coefficients constitute the identification feature.
9. The voiceprint recognition method based on channel compensation according to claim 1, characterised in that in step S6 the identification feature space is divided into several subspaces with the K-means algorithm, and for each subspace obtained the K-means centroid is recorded as the description information of that subspace.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610025193.5A CN106971730A (en) | 2016-01-14 | 2016-01-14 | A kind of method for recognizing sound-groove based on channel compensation |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610025193.5A CN106971730A (en) | 2016-01-14 | 2016-01-14 | A kind of method for recognizing sound-groove based on channel compensation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106971730A true CN106971730A (en) | 2017-07-21 |
Family
ID=59335188
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610025193.5A Pending CN106971730A (en) | 2016-01-14 | 2016-01-14 | A kind of method for recognizing sound-groove based on channel compensation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106971730A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108492830A (en) * | 2018-03-28 | 2018-09-04 | 深圳市声扬科技有限公司 | Method for recognizing sound-groove, device, computer equipment and storage medium |
CN111312283A (en) * | 2020-02-24 | 2020-06-19 | 中国工商银行股份有限公司 | Cross-channel voiceprint processing method and device |
CN113488058A (en) * | 2021-06-23 | 2021-10-08 | 武汉理工大学 | Voiceprint recognition method based on short voice |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20030075330A (en) * | 2002-03-18 | 2003-09-26 | 정희석 | Channel Mis-match Compensation apparatus and method for Robust Speaker Verification system |
CN101241699A (en) * | 2008-03-14 | 2008-08-13 | 北京交通大学 | A Speaker Confirmation System in Distance Chinese Teaching |
CN101661754A (en) * | 2003-10-03 | 2010-03-03 | 旭化成株式会社 | Data processing unit, method and control program |
CN101944359A (en) * | 2010-07-23 | 2011-01-12 | 杭州网豆数字技术有限公司 | Voice recognition method facing specific crowd |
CN102129859A (en) * | 2010-01-18 | 2011-07-20 | 盛乐信息技术(上海)有限公司 | Voiceprint authentication system and method for rapid channel compensation |
CN102623008A (en) * | 2011-06-21 | 2012-08-01 | 中国科学院苏州纳米技术与纳米仿生研究所 | voiceprint recognition method |
CN104185868A (en) * | 2012-01-24 | 2014-12-03 | 澳尔亚有限公司 | Voice authentication and speech recognition system and method |
CN104392718A (en) * | 2014-11-26 | 2015-03-04 | 河海大学 | Robust voice recognition method based on acoustic model array |
US20150066494A1 (en) * | 2013-09-03 | 2015-03-05 | Amazon Technologies, Inc. | Smart circular audio buffer |
- 2016-01-14: CN CN201610025193.5A patent/CN106971730A/en active Pending
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20030075330A (en) * | 2002-03-18 | 2003-09-26 | 정희석 | Channel Mis-match Compensation apparatus and method for Robust Speaker Verification system |
CN101661754A (en) * | 2003-10-03 | 2010-03-03 | 旭化成株式会社 | Data processing unit, method and control program |
CN101241699A (en) * | 2008-03-14 | 2008-08-13 | 北京交通大学 | A Speaker Confirmation System in Distance Chinese Teaching |
CN102129859A (en) * | 2010-01-18 | 2011-07-20 | 盛乐信息技术(上海)有限公司 | Voiceprint authentication system and method for rapid channel compensation |
CN101944359A (en) * | 2010-07-23 | 2011-01-12 | 杭州网豆数字技术有限公司 | Voice recognition method facing specific crowd |
CN102623008A (en) * | 2011-06-21 | 2012-08-01 | 中国科学院苏州纳米技术与纳米仿生研究所 | voiceprint recognition method |
CN104185868A (en) * | 2012-01-24 | 2014-12-03 | 澳尔亚有限公司 | Voice authentication and speech recognition system and method |
US20150066494A1 (en) * | 2013-09-03 | 2015-03-05 | Amazon Technologies, Inc. | Smart circular audio buffer |
CN104392718A (en) * | 2014-11-26 | 2015-03-04 | 河海大学 | Robust voice recognition method based on acoustic model array |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108492830A (en) * | 2018-03-28 | 2018-09-04 | 深圳市声扬科技有限公司 | Method for recognizing sound-groove, device, computer equipment and storage medium |
CN111312283A (en) * | 2020-02-24 | 2020-06-19 | 中国工商银行股份有限公司 | Cross-channel voiceprint processing method and device |
CN113488058A (en) * | 2021-06-23 | 2021-10-08 | 武汉理工大学 | Voiceprint recognition method based on short voice |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106971737A (en) | A kind of method for recognizing sound-groove spoken based on many people | |
CN109817246B (en) | Emotion recognition model training method, emotion recognition device, emotion recognition equipment and storage medium | |
CN107464568B (en) | Speaker identification method and system based on three-dimensional convolution neural network text independence | |
CN105374356B (en) | Audio recognition method, speech assessment method, speech recognition system and speech assessment system | |
CN108122556A (en) | Reduce the method and device that driver's voice wakes up instruction word false triggering | |
CN108172218B (en) | Voice modeling method and device | |
CN107492382A (en) | Voiceprint extracting method and device based on neutral net | |
CN106898355B (en) | Speaker identification method based on secondary modeling | |
CN107731233A (en) | A kind of method for recognizing sound-groove based on RNN | |
CN106952648A (en) | A kind of output intent and robot for robot | |
CN106205624A (en) | A kind of method for recognizing sound-groove based on DBSCAN algorithm | |
CN103207961A (en) | User verification method and device | |
Fong | Using hierarchical time series clustering algorithm and wavelet classifier for biometric voice classification | |
CN106971724A (en) | A kind of anti-tampering method for recognizing sound-groove and system | |
CN106971730A (en) | A kind of method for recognizing sound-groove based on channel compensation | |
CN116564315A (en) | Voiceprint recognition method, voiceprint recognition device, voiceprint recognition equipment and storage medium | |
CN105845143A (en) | Speaker confirmation method and speaker confirmation system based on support vector machine | |
CN106971727A (en) | A kind of verification method of Application on Voiceprint Recognition | |
CN106971731A (en) | A kind of modification method of Application on Voiceprint Recognition | |
CN106887230A (en) | A kind of method for recognizing sound-groove in feature based space | |
CN116434758A (en) | Voiceprint recognition model training method and device, electronic equipment and storage medium | |
CN113643688B (en) | Mongolian voice feature fusion method and device | |
CN106981288A (en) | A kind of authentication method of Application on Voiceprint Recognition | |
CN115995106A (en) | A multi-modal safety protection method for construction site robots | |
CN113948089B (en) | Voiceprint model training and voiceprint recognition methods, devices, equipment and media |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170721 |