CN108460081A - Voice data base establishing method, voiceprint registration method, apparatus, equipment and medium - Google Patents
Voice data base establishing method, voiceprint registration method, apparatus, equipment and medium
- Publication number
- CN108460081A CN108460081A CN201810031164.9A CN201810031164A CN108460081A CN 108460081 A CN108460081 A CN 108460081A CN 201810031164 A CN201810031164 A CN 201810031164A CN 108460081 A CN108460081 A CN 108460081A
- Authority
- CN
- China
- Prior art keywords
- voice data
- registration
- voice
- primary
- index
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/61—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/63—Querying
- G06F16/635—Filtering based on additional data, e.g. user or group profiles
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
- G06F16/68—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/683—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
- G06F16/685—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using automatically derived transcript of audio data, e.g. lyrics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/04—Training, enrolment or model building
Abstract
The invention discloses a voice database establishing method, a voiceprint registration method, an apparatus, a device and a medium. The voice database establishing method includes: obtaining original voice data, where the original voice data includes an original user identifier and a voice collection time; preprocessing the original voice data to obtain valid voice data; obtaining the signal-to-noise ratio corresponding to the valid voice data; and storing the valid voice data in a voice database and establishing an index for the valid voice data in the voice database, where the index includes the original user identifier, the voice collection time and the signal-to-noise ratio. By preprocessing the original voice data, computing the signal-to-noise ratio of the valid voice data, and establishing an index containing the user identifier, the voice collection time and the signal-to-noise ratio after the voice database is created, the method improves the data processing efficiency of the database.
Description
Technical field
The present invention relates to the field of data processing, and more particularly to a voice database establishing method, a voiceprint registration method, an apparatus, a device and a medium.
Background technology
With the development of artificial intelligence technology, technologies based on human biometric characteristics such as the face, voice and fingerprint are gradually being applied in daily life. A voiceprint is the spectrum of the sound waves carrying verbal information, as displayed by an electro-acoustic instrument; it is both distinctive and relatively stable. The production of human speech is a complex physiological and physical process involving the language centers of the brain and the vocal organs. The vocal organs used in speech (tongue, teeth, larynx, lungs and nasal cavity) vary greatly among individuals in size and form, so the voiceprint maps of any two people differ, and a user's identity can therefore be verified by voiceprint. Voiceprint recognition requires the voiceprint to be registered in advance. Current voiceprint registration typically records voice data in real time and extracts the voiceprint from the recording. Going from recorded voice data to an extracted voiceprint takes a long time, which makes the whole registration process slow and the registration efficiency low. Moreover, when the voiceprint is registered from voice data recorded in real time, the ambient conditions and the user's physical state at recording time can make the recorded voice data differ considerably from voice data collected at other times, which reduces the accuracy of the voiceprint extracted from the real-time recording in voiceprint recognition.
Invention content
Embodiments of the present invention provide a voice database establishing method, apparatus, device and medium, to solve the problem of low database processing efficiency.
Embodiments of the present invention also provide a voiceprint registration method, apparatus, device and medium, to solve the problem of low voiceprint feature accuracy.
In a first aspect, an embodiment of the present invention provides a voice database establishing method, including:
obtaining original voice data, where the original voice data includes an original user identifier and a voice collection time;
preprocessing the original voice data to obtain valid voice data;
obtaining the signal-to-noise ratio corresponding to the valid voice data;
storing the valid voice data in a voice database, and establishing an index for the valid voice data in the voice database, where the index includes the original user identifier, the voice collection time and the signal-to-noise ratio.
In a second aspect, an embodiment of the present invention provides a voice database creating apparatus, including:
an original voice data acquisition module, configured to obtain original voice data, where the original voice data includes an original user identifier and a voice collection time;
a data preprocessing module, configured to preprocess the original voice data to obtain valid voice data;
a signal-to-noise ratio acquisition module, configured to obtain the signal-to-noise ratio corresponding to the valid voice data;
a voice database index establishing module, configured to store the valid voice data in a voice database and establish an index for the valid voice data in the voice database, where the index includes the original user identifier, the voice collection time and the signal-to-noise ratio.
In a third aspect, an embodiment of the present invention provides a voiceprint registration method, including:
obtaining a voiceprint registration request, where the voiceprint registration request includes a registration user identifier and the current time;
querying the voice database based on the registration user identifier to obtain the target indexes corresponding to the original user identifier that matches the registration user identifier, where the voice database is a voice database created by the voice database establishing method of the first aspect;
obtaining a composite score for each target index according to the voice collection time and the signal-to-noise ratio in the target index and the current time;
selecting the valid voice data corresponding to the target index with the highest composite score as the registration voice data;
obtaining the corresponding voiceprint feature from the registration voice data as the registration voiceprint.
In a fourth aspect, an embodiment of the present invention provides a voiceprint registration apparatus, including:
a voiceprint registration request acquisition module, configured to obtain a voiceprint registration request, where the voiceprint registration request includes a registration user identifier and the current time;
a target index acquisition module, configured to query the voice database based on the registration user identifier to obtain the target indexes corresponding to the original user identifier that matches the registration user identifier, where the voice database is a voice database created by the voice database establishing method of the first aspect;
a composite score acquisition module, configured to obtain the composite score of each target index according to the voice collection time and the signal-to-noise ratio in the target index and the current time;
a registration voice data acquisition module, configured to select the valid voice data corresponding to the target index with the highest composite score as the registration voice data;
a registration voiceprint acquisition module, configured to obtain the corresponding voiceprint feature from the registration voice data as the registration voiceprint.
In a fifth aspect, the present invention provides a terminal device, including a memory, a processor and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the steps of the voice database establishing method according to the first aspect of the present invention; or the processor, when executing the computer program, implements the steps of the voiceprint registration method according to the third aspect of the present invention.
In a sixth aspect, the present invention provides a computer-readable storage medium storing a computer program, where the computer program, when executed by a processor, implements the steps of the voice database establishing method according to the first aspect of the present invention; or, when executed by a processor, implements the steps of the voiceprint registration method according to the third aspect of the present invention.
In the voice database establishing method, apparatus, device and storage medium provided by the embodiments of the present invention, original voice data is obtained to provide a data source for creating the voice database. The original voice data is then preprocessed to obtain valid voice data, which improves subsequent processing efficiency and saves data processing time. The signal-to-noise ratio corresponding to the valid voice data is obtained; from this signal-to-noise ratio, the noise level of the valid voice data can be judged directly, and thus the voice quality of the valid voice data is known. Finally, the valid voice data is stored in the voice database, and an index is established for the valid voice data in the voice database, the index including the original user identifier, the voice collection time and the signal-to-noise ratio. By preprocessing the original voice data, computing the signal-to-noise ratio of the valid voice data, and establishing an index containing the user identifier, the voice collection time and the signal-to-noise ratio after the voice database is created, the method improves database processing efficiency and also improves the accuracy of the voiceprint features. It also enables the subsequent voiceprint registration stage to quickly locate suitable valid voice data. Through the careful design of the voice database creation process, the accuracy of voiceprint feature extraction in the subsequent voiceprint registration stage is improved and the registration time of voiceprint registration is reduced.
In the voiceprint registration method, apparatus, device and storage medium provided by the embodiments of the present invention, voiceprint registration is performed using the voice database created by the voice database establishing method of the first aspect, which improves the accuracy of voiceprint feature extraction in the registration stage and reduces the registration time of voiceprint registration. During registration, a composite score is obtained for each target index of the corresponding valid voice data, which makes it possible to quickly locate suitable valid voice data and ensures that the voiceprint feature most consistent with the user is extracted, further improving the accuracy of voiceprint registration.
Description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for the description of the embodiments are briefly introduced below. Evidently, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from these drawings without creative effort.
Fig. 1 is a flowchart of the voice database establishing method provided in Embodiment 1 of the present invention;
Fig. 2 is a flowchart of one specific implementation of step S12 in Fig. 1;
Fig. 3 is a flowchart of another specific implementation of step S12 in Fig. 1;
Fig. 4 is a functional block diagram of the voice database creating apparatus provided in Embodiment 2 of the present invention;
Fig. 5 is a flowchart of the voiceprint registration method provided in Embodiment 3 of the present invention;
Fig. 6 is a flowchart of a specific implementation in Embodiment 3 of the present invention;
Fig. 7 is a functional block diagram of the voiceprint registration apparatus provided in Embodiment 4 of the present invention;
Fig. 8 is a schematic diagram of the terminal device provided in Embodiment 6 of the present invention.
Specific implementation mode
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings in the embodiments. Evidently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Embodiment 1
Fig. 1 shows a flowchart of the voice database establishing method in this embodiment. The method can be applied to various terminal devices or servers to create a voice database and solve the problem of low database processing efficiency. As shown in Fig. 1, the voice database establishing method includes the following steps:
S11: Obtain original voice data, where the original voice data includes an original user identifier and a voice collection time.
Here, original voice data refers to voice data that has not yet been processed after collection. The original user identifier is a mark used to distinguish different users, and each original user identifier corresponds to a unique user. In a specific embodiment, the original user identifier may be the user's phone number, user account or ID card number. The voice collection time refers to the time at which the original voice data was collected.
Preferably, the original voice data may be obtained from a database in which a large amount of user voice data has been collected. For example, some enterprises set up customer service hotlines: by dialing the hotline, users resolve problems encountered while using the enterprise's products or services, and the enterprise may also use the hotline for product promotion or follow-up calls to customers. Usually the enterprise records these calls and stores the recordings in a database. Alternatively, in some applications, when voice interaction takes place between users or between a user and customer service, the application's database stores the users' voice data.
S12: Preprocess the original voice data to obtain valid voice data.
Original voice data is unprocessed data, so it may contain invalid or redundant voice data. For example, the voice duration of the original voice data may not meet the requirement, the original voice data may include voice that does not belong to the user, or the voice quality of the original voice data may be unsatisfactory. Alternatively, a piece of original voice data may contain some invalid or redundant speech segments, where a speech segment is a part of the original voice data; the presence of such segments adversely affects subsequent voice data processing, so they need to be removed. By preprocessing the original voice data to obtain valid voice data, the efficiency of subsequent voice data processing is improved and time is saved.
S13: Obtain the signal-to-noise ratio corresponding to the valid voice data.
The signal-to-noise ratio (SNR) is a parameter describing the ratio of the useful component to the noise component in a signal. A higher SNR indicates relatively less noise. By obtaining the SNR of the valid voice data, the amount of noise in the valid voice data can be judged directly, and thus the voice quality of the valid voice data is known. Specifically, the SNR corresponding to the valid voice data can be obtained by calculation.
When the SNR corresponding to the valid voice data is obtained by calculation, the formula may be: SNR = 10 lg(P_S / P_N), where P_S and P_N are the effective powers of the useful component and the noise component respectively, and lg denotes the base-10 logarithm. Optionally, the ratio can also be expressed in terms of voltage amplitudes, i.e. the SNR formula can also be written as: SNR = 20 lg(V_S / V_N), where V_S and V_N are the effective (RMS) values of the useful-component voltage and the noise-component voltage respectively.
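As a quick check of the two formulas, here is a minimal sketch in Python; the function names are illustrative, not part of the patent:

```python
import math

def snr_power_db(p_signal: float, p_noise: float) -> float:
    """SNR in dB from the effective powers of the useful and noise components."""
    return 10 * math.log10(p_signal / p_noise)

def snr_voltage_db(v_signal: float, v_noise: float) -> float:
    """Equivalent SNR from RMS voltages; power scales with the square of voltage."""
    return 20 * math.log10(v_signal / v_noise)

# A 100:1 power ratio equals a 10:1 voltage ratio, so both forms give 20 dB.
print(snr_power_db(100.0, 1.0))   # 20.0
print(snr_voltage_db(10.0, 1.0))  # 20.0
```

The two forms agree whenever V is the RMS voltage of the corresponding component, since P is proportional to V squared.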
In one embodiment, obtaining the SNR corresponding to the valid voice data specifically includes the following steps:
First, extract the pitch data from the valid voice data using a pitch-synchronous overlap-add (PSOLA) style algorithm. The pitch data is the normal voice content of the valid voice data, as opposed to the noise data. Preferably, spectral subtraction, Wiener filtering or minimum mean-square error short-time spectral estimation may be used to extract the pitch data from the voice data.
Then, obtain the noise data in the valid voice data from the pitch data. After the pitch data has been extracted from the voice data, the remaining part is the noise data of the voice data.
Finally, calculate the SNR of the voice data from the pitch data and the noise data. Once the pitch data and the noise data of the valid voice data have been obtained, the SNR of the valid voice data can be calculated: first compute the effective powers of the pitch data and the noise data (or their voltage amplitudes), then take their ratio to obtain the SNR of the valid voice data.
In a specific embodiment, after the step of obtaining the SNR corresponding to the valid voice data, the method further includes:
removing valid voice data whose SNR is below an SNR threshold.
After the SNR of the valid voice data has been obtained, valid voice data with a too-low SNR can be removed to reduce the data volume and relieve the pressure of data processing and storage. Specifically, an SNR threshold can be set; when the SNR of a piece of valid voice data is below this threshold, the noise in that piece of valid voice data is high, so it is not suitable as voice data for voiceprint extraction. Removing valid voice data whose SNR is below the SNR threshold reduces the data volume, relieving the pressure of data processing and storage, shortening subsequent data processing time and improving processing efficiency.
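The removal step amounts to a simple filter over the stored records. A minimal sketch follows; the 10 dB threshold and the record layout are illustrative assumptions, since the patent leaves the concrete value and schema open:

```python
SNR_THRESHOLD_DB = 10.0  # assumed threshold; the patent does not fix a value

def filter_by_snr(records, threshold=SNR_THRESHOLD_DB):
    """Keep only valid-voice-data records whose SNR meets the threshold."""
    return [r for r in records if r["snr"] >= threshold]

records = [
    {"user": "u1", "snr": 25.3},
    {"user": "u1", "snr": 4.1},   # too noisy for voiceprint extraction
    {"user": "u2", "snr": 12.8},
]
print([r["snr"] for r in filter_by_snr(records)])  # [25.3, 12.8]
```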
S14: Store the valid voice data in the voice database, and establish an index for the valid voice data in the voice database, where the index includes the original user identifier, the voice collection time and the signal-to-noise ratio.
Here, the voice database is the database used to store the valid voice data. The valid voice data obtained after preprocessing and SNR calculation is stored in the voice database, and an index is established for each piece of valid voice data, improving the efficiency of subsequent data processing based on the voice database. Moreover, during voiceprint registration, suitable valid voice data can be located directly through index lookup, and the voiceprint feature can be extracted from the corresponding valid voice data, which improves the accuracy of the voiceprint feature.
Specifically, the index includes the original user identifier, the voice collection time and the signal-to-noise ratio. The original user identifier distinguishes the valid voice data of different users. The voice collection time represents the recording time of the voice; in general, a user's voice changes slightly over time, so the closer the voice collection time is to the current time, the closer the piece of valid voice data is to the user's current voice, and the more consistent its voiceprint feature. The signal-to-noise ratio makes it possible to judge the noise level of the valid voice data directly, and thus to know the voice quality of the valid voice data.
In one embodiment, the index established for the valid voice data in the voice database is a BRIN index. A BRIN (Block Range Index) stores, for each range of consecutive data blocks of a table, the corresponding range of data values, which offers a large advantage in saving system space. The voice database needs to store the valid voice data corresponding to a large number of original user identifiers and therefore demands considerable storage space; using BRIN indexes saves a large amount of index space.
Therefore, establishing indexes on the voice database improves database processing efficiency and also improves the accuracy of the voiceprint features. Moreover, in the voiceprint registration stage, the original user identifier, the voice collection time and the signal-to-noise ratio in the index can be considered together, making it possible to quickly locate the most suitable valid voice data and to register with the voiceprint feature of that valid voice data. This greatly reduces the time taken to form the voiceprint feature in the voiceprint registration stage, and selecting the most suitable valid voice data also improves the accuracy of voiceprint registration.
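The registration stage combines the voice collection time, the signal-to-noise ratio and the current time into one composite score per index entry, but the patent does not give an exact formula. The weighting below is purely an illustrative assumption to show the shape of the selection step:

```python
def composite_score(entry, now, w_recency=0.5, w_snr=0.5, snr_full_scale=60.0):
    """Hypothetical scoring: favor recently collected, high-SNR voice data.

    `entry` is an index record with 'collected_at' (seconds since epoch) and
    'snr' (dB); the weights and the 60 dB full scale are assumptions, not
    values taken from the patent.
    """
    age_days = max(0.0, (now - entry["collected_at"]) / 86400.0)
    recency = 1.0 / (1.0 + age_days)                   # 1.0 for brand-new data
    quality = min(entry["snr"] / snr_full_scale, 1.0)  # clamp to [0, 1]
    return w_recency * recency + w_snr * quality

now = 1_700_000_000
entries = [
    {"collected_at": now - 90 * 86400, "snr": 30.0},  # old recording
    {"collected_at": now - 1 * 86400, "snr": 30.0},   # recent, same quality
]
best = max(entries, key=lambda e: composite_score(e, now))
print(best["collected_at"] == now - 86400)  # True: the recent one wins
```

Whatever the concrete formula, the point is the same: the entry maximizing the score is the one whose valid voice data becomes the registration voice data.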
In the voice database establishing method provided by the embodiments of the present invention, original voice data is obtained to provide a data source for creating the voice database. The original voice data is then preprocessed to obtain valid voice data, improving subsequent processing efficiency and saving data processing time. The signal-to-noise ratio corresponding to the valid voice data is obtained; from it the noise level of the valid voice data can be judged directly, and thus its voice quality is known. Finally, the valid voice data is stored in the voice database, and an index is established for the valid voice data in the voice database, the index including the original user identifier, the voice collection time and the signal-to-noise ratio. By preprocessing the original voice data, computing the SNR of the valid voice data, and establishing an index containing the user identifier, the voice collection time and the SNR after the voice database is created, the method improves database processing efficiency and also improves the accuracy of the voiceprint features. It also enables the subsequent voiceprint registration stage to quickly locate suitable valid voice data. Through the careful design of the voice database creation process, the accuracy of voiceprint feature extraction in the subsequent voiceprint registration stage is improved and the registration time of voiceprint registration is greatly reduced.
In a specific embodiment, preprocessing the original voice data to obtain valid voice data specifically includes the following step: performing filtering and silence removal on the original voice data corresponding to each original user identifier to obtain the valid voice data.
Among the original voice data corresponding to the same original user identifier, there may be a small amount of original voice data that does not belong to the user corresponding to that identifier (i.e. cases where someone else used the account). What is stored in that case is not the voice of the user corresponding to the original user identifier, and this part of the original voice data needs to be removed to avoid deviations when voiceprint features are later extracted from the original voice data.
Therefore, filtering the original voice data corresponding to each original user identifier means finding, among the original voice data, the pieces that do not belong to the user corresponding to the original user identifier, and removing them. Specifically, a clustering algorithm or one-by-one comparison and matching may be used to find the original voice data that does not belong to the user.
Within a piece of original voice data, the voice in some periods may be silent, such as waiting periods during a call. The voice data in these periods is invalid or redundant voice data, so silence removal is needed.
Preferably, voice activity detection (VAD) may be used to examine the original voice data and identify the speech portions and the non-speech portions, where the non-speech portions are the silent parts; removing the silent parts yields original voice data with silence removed.
The purpose of voice activity detection is to detect whether a speech signal is present in the current signal, i.e. to judge the input voice data and distinguish the speech signal from various background noise signals, so that the two kinds of signal can be handled with different processing methods. Through voice activity detection, the speech portions and the silent portions in a piece of original voice data are identified, and the silent portions are removed to obtain original voice data with silence removed.
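A minimal energy-based sketch of this silence-removal step follows. Real VAD implementations use more robust features than raw frame energy, and the frame length and threshold here are illustrative assumptions:

```python
def remove_silence(samples, frame_len=160, energy_threshold=0.01):
    """Drop frames whose mean energy falls below the threshold (treated as silence)."""
    voiced = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        if energy >= energy_threshold:
            voiced.extend(frame)
    return voiced

speech = [0.5] * 160   # loud frame: kept
silence = [0.0] * 160  # silent frame: removed
print(len(remove_silence(speech + silence + speech)))  # 320
```

At an 8 kHz sampling rate a 160-sample frame is 20 ms, a common VAD frame size.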
It should be understood that the order in which the filtering and the silence removal are applied to the original voice data corresponding to each original user identifier can be swapped, and the voice data obtained after filtering and silence removal is called the valid voice data. Filtering may be performed first and silence removal afterwards, or silence removal first and filtering afterwards.
In this embodiment, removing the original voice data that does not belong to the user corresponding to the original user identifier improves the accuracy of the data stored in the voice database, and performing silence removal on the original voice data shortens the processing time of subsequent data processing and improves processing efficiency.
In a specific embodiment, as shown in Fig. 2, filtering the original voice data corresponding to each original user identifier specifically includes the following steps:
S121: Extract the voiceprint features of the original voice data corresponding to the same original user identifier.
Based on the original user identifier, voiceprint feature extraction is performed on the original voice data corresponding to the same original user identifier. Voiceprint features are the essential characteristics of a person carried in the original voice data, such as the pitch contour, the bandwidths and trajectories of the formants, spectral envelope parameters, auditory characteristic parameters, linear prediction cepstrum coefficients and their derived or hybrid parameters. Specifically, voiceprint feature extraction may be based on linear predictive coding (LPC) or Mel-frequency cepstral coefficients (MFCC).
S122: Based on the voiceprint features, perform cluster analysis on the original voice data corresponding to the same original user identifier using the k-means algorithm to obtain a target center point.
Cluster analysis is a statistical method for studying classification problems (of samples or indicators) and is also an important analysis method in data mining. Cluster analysis is performed on the original voice data using the k-means algorithm to obtain the target center point. Specifically, the value of K is set according to the quantity of original voice data corresponding to the same original user identifier, and an initial center point is set for each cluster. After all the points (pieces of original voice data) have been assigned, the center point of each cluster is recomputed from all the points in that cluster (for example, by computing their average). The steps of assigning points and updating the cluster center points are then repeated iteratively until the cluster center points change little or a specified number of iterations is reached. The center point of the cluster containing the most points (pieces of original voice data) is taken as the target center point.
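The clustering step above can be sketched with a small stdlib-only k-means. Initializing the centers with the first K points is an assumption made here for determinism; in practice the feature vectors would come from the LPC/MFCC extraction of S121:

```python
import math

def dist(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(points):
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))

def kmeans(points, k, iters=20):
    centers = [points[i] for i in range(k)]  # simple init: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:  # assign each point to its nearest center
            j = min(range(k), key=lambda c: dist(p, centers[c]))
            clusters[j].append(p)
        # recompute each center as the average of its cluster
        centers = [mean(cl) if cl else centers[i] for i, cl in enumerate(clusters)]
    return centers, clusters

# Five features from the account owner, two from another speaker (toy 2-D data).
feats = [(0.0, 0.0), (10.0, 10.0), (0.1, 0.0), (0.0, 0.1),
         (0.1, 0.1), (10.1, 10.0), (0.05, 0.05)]
centers, clusters = kmeans(feats, k=2)
largest = max(range(2), key=lambda i: len(clusters[i]))
target_center = centers[largest]  # center of the biggest cluster
print(target_center)  # approximately (0.05, 0.05)
```

The center of the largest cluster plays the role of the target center point: most recordings under one identifier are assumed to come from the account owner.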
S123: Using a distance algorithm, calculate the distance between each piece of original voice data corresponding to the same original user identifier and the target center point.
A distance algorithm refers to an algorithm that estimates the similarity between different samples. In one embodiment, the distance between each piece of original voice data and the target center point may be calculated using the Manhattan distance, the Minkowski distance, cosine similarity, the Euclidean distance, or a similar algorithm.
In one embodiment, the Euclidean distance between each piece of original voice data and the target center point is calculated using the Euclidean distance algorithm.
The Euclidean distance refers to the actual distance between two points in m-dimensional space, or the natural length of a vector (i.e., the distance from that point to the origin). For any two n-dimensional vectors a(X_i1, X_i2, ..., X_in) and b(X_j1, X_j2, ..., X_jn), the Euclidean distance is d(a, b) = sqrt(Σ_{k=1..n} (X_ik − X_jk)^2). Based on the voiceprint feature of each piece of original voice data, the Euclidean distance between each piece of original voice data and the target center point is calculated by the Euclidean distance algorithm.
S124: Remove, from the original voice data corresponding to the same original user identifier, the original voice data whose distance from the target center point is greater than a distance threshold.
After cluster analysis, among the original voice data corresponding to the same original user identifier, the original voice data that truly belongs to the user clusters near the target center point, so the distance between this part of the original voice data and the target center point is small. Original voice data that does not belong to the user lies far from the target center point, i.e., its distance from the target center point is larger. Therefore, by setting a reasonable distance threshold, the original voice data that does not belong to the user can be screened out and removed from the original voice data corresponding to the same original user identifier, thereby ensuring the accuracy of the data.
In this embodiment, cluster analysis is performed on the original voice data corresponding to the same original user identifier using a clustering algorithm, the distance between each piece of original voice data and the target center point of the cluster is calculated, and the original voice data whose distance exceeds the distance threshold is removed. Removing erroneous original voice data ensures data accuracy and, by reducing the data volume, also improves data processing efficiency.
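The clustering-based filtering of steps S121-S124 can be sketched as follows. This is an illustrative Python sketch only, with hypothetical function names; the feature vectors stand in for whatever voiceprint features the embodiment extracts:

```python
import math
import random

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20, seed=0):
    """Minimal k-means: returns (cluster centers, per-point assignment)."""
    rnd = random.Random(seed)
    centers = [list(p) for p in rnd.sample(points, k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # assignment step: each point joins its nearest center's cluster
        assign = [min(range(k), key=lambda c: dist(p, centers[c]))
                  for p in points]
        # update step: each center becomes the mean of its cluster
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centers, assign

def filter_by_cluster(features, k, distance_threshold):
    """Steps S121-S124: keep only samples within distance_threshold of the
    target center point, i.e. the center of the most populated cluster."""
    centers, assign = kmeans(features, k)
    target = centers[max(range(k), key=assign.count)]
    return [f for f in features if dist(f, target) <= distance_threshold]
```

With most samples belonging to the genuine user, the largest cluster's center approximates that user's voiceprint, and outlying samples are dropped by the distance test.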
In a specific embodiment, the original voice data corresponding to each original user identifier is filtered, as shown in Fig. 3, specifically including the following steps:
S121': Extract the voiceprint feature of the original voice data corresponding to the same original user identifier.
Based on the original user identifier, voiceprint feature extraction is performed on the original voice data corresponding to the same original user identifier. Specifically, the extraction of the voiceprint feature may be based on Linear Predictive Coding (LPC) or Mel-Frequency Cepstral Coefficients (MFCC).
S122': Compare and match, one by one, the voiceprint feature corresponding to each piece of original voice data under the same user identifier against the voiceprint features corresponding to the remaining original voice data under the same user identifier, and count the number of matching failures of each piece of original voice data according to the matching results.
The matching result is either a matching success or a matching failure. Among the original voice data corresponding to the same user identifier, when there is original voice data that does not belong to the user, the voiceprint feature of that original voice data does not match the voiceprint features of the original voice data that does belong to the user (i.e., matching fails). Therefore, by comparing and matching the voiceprint feature of each piece of original voice data under the same user identifier against those of the remaining original voice data one by one, whenever original voice data that does not belong to the user is compared with original voice data that does, the matching result will be a failure.
S123': When the number of matching failures of a piece of original voice data is greater than a matching threshold, remove that original voice data.
When a piece of original voice data has many matching failures, its voiceprint feature does not match the voiceprint features of most of the other original voice data. From this it can be determined that the piece is original voice data that does not belong to the user. Therefore, a matching threshold can be preset; when the number of matching failures of a piece of original voice data exceeds the matching threshold, that piece is removed, which ensures data accuracy and, by reducing the data volume, also improves data processing efficiency.
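Steps S121'-S123' amount to a pairwise vote: each sample's voiceprint feature is compared against every other sample's, and samples that fail too often are discarded. A minimal sketch, assuming a caller-supplied similarity function and minimum similarity (both hypothetical, since the embodiment does not fix a particular matching procedure):

```python
def filter_by_match_failures(features, match_threshold,
                             similarity, min_similarity):
    """Steps S121'-S123': count, for each sample, how many of the other
    samples its voiceprint feature fails to match, then drop every sample
    whose failure count exceeds match_threshold."""
    n = len(features)
    failures = [0] * n
    for i in range(n):
        for j in range(n):
            # a comparison fails when similarity falls below min_similarity
            if i != j and similarity(features[i], features[j]) < min_similarity:
                failures[i] += 1
    return [f for i, f in enumerate(features) if failures[i] <= match_threshold]
```
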
In a specific embodiment, filtering the original voice data corresponding to each original user identifier further includes the following specific steps:
Judge whether the amount of original voice data corresponding to the same original user identifier is greater than or equal to a clustering threshold; if the amount of original voice data corresponding to the same original user identifier is greater than or equal to the clustering threshold, perform steps S121-S124; if the amount is less than the clustering threshold, perform steps S121'-S123'.
For a clustering algorithm, the accuracy of cluster analysis is positively correlated with the data volume. When the data volume is small, the clustering accuracy decreases, and applying a clustering algorithm to a small data set adds computational complexity. Therefore, a clustering threshold can be set, and its specific value can be adjusted according to the algorithm's characteristics and actual requirements. Preferably, the clustering threshold is 10. When the amount of original voice data corresponding to the same original user identifier is greater than or equal to the clustering threshold, the embodiment of steps S121-S124 is used to filter the original voice data. When the data volume is less than the clustering threshold, steps S121'-S123' are used to filter the original voice data instead.
In this embodiment, an appropriate processing algorithm is selected according to the size of the data volume to filter the original voice data, which improves the accuracy of data processing.
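The volume-based dispatch described above is straightforward; the sketch below uses the preferred clustering threshold of 10 and takes the two filtering procedures as callables (hypothetical names, standing in for steps S121-S124 and S121'-S123'):

```python
CLUSTER_THRESHOLD = 10  # the preferred value given in the embodiment

def choose_filter(samples, cluster_filter, matching_filter,
                  cluster_threshold=CLUSTER_THRESHOLD):
    """Apply cluster-based filtering (S121-S124) when the data volume is at
    least the clustering threshold, otherwise pairwise matching (S121'-S123')."""
    if len(samples) >= cluster_threshold:
        return cluster_filter(samples)
    return matching_filter(samples)
```
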
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
Embodiment 2
Fig. 4 shows a functional block diagram of a speech database creating device corresponding one-to-one with the voice database establishing method in Embodiment 1. As shown in Fig. 4, the speech database creating device includes an original voice data acquisition module 11, a data preprocessing module 12, a signal-to-noise ratio acquisition module 13, and a speech database index establishing module 14. The functions realized by the original voice data acquisition module 11, the data preprocessing module 12, the signal-to-noise ratio acquisition module 13, and the speech database index establishing module 14 correspond one-to-one with the steps of the voice database establishing method in Embodiment 1; to avoid repetition, this embodiment does not describe them in detail one by one.
The original voice data acquisition module 11 is configured to acquire original voice data, where the original voice data includes an original user identifier and a voice collection time.
The data preprocessing module 12 is configured to preprocess the original voice data to obtain valid voice data.
The signal-to-noise ratio acquisition module 13 is configured to acquire the signal-to-noise ratio corresponding to the valid voice data.
The speech database index establishing module 14 is configured to store the valid voice data in the speech database and establish an index for the valid voice data in the speech database, where the index includes the original user identifier, the voice collection time, and the signal-to-noise ratio.
Preferably, the data preprocessing module 12 includes a voiceprint feature extraction unit 121, a cluster analysis unit 122, a distance calculation unit 123, and a first data removal unit 124.
The voiceprint feature extraction unit 121 is configured to extract the voiceprint feature of the original voice data corresponding to the same original user identifier.
The cluster analysis unit 122 is configured to, based on the voiceprint features, perform cluster analysis on the original voice data corresponding to the same original user identifier using the k-means clustering algorithm to obtain a target center point.
The distance calculation unit 123 is configured to calculate, using a distance algorithm, the distance between each piece of original voice data corresponding to the same original user identifier and the target center point.
The first data removal unit 124 is configured to remove, from the original voice data corresponding to the same original user identifier, the original voice data whose distance from the target center point is greater than the distance threshold.
Preferably, the data preprocessing module 12 further includes a data comparison and matching unit 122' and a second data removal unit 123'.
The data comparison and matching unit 122' is configured to compare and match, one by one, the voiceprint feature corresponding to each piece of original voice data under the same user identifier against the voiceprint features corresponding to the remaining original voice data under the same user identifier, and to count the number of matching failures of each piece of original voice data according to the matching results.
The second data removal unit 123' is configured to remove a piece of original voice data when its number of matching failures is greater than the matching threshold.
Preferably, the data preprocessing module 12 further includes an original voice data amount judging unit 120.
The original voice data amount judging unit 120 is configured to judge whether the amount of original voice data corresponding to the same original user identifier is greater than or equal to the clustering threshold.
Embodiment 3
Fig. 5 shows a flowchart of the voiceprint registration method in this embodiment. The voiceprint registration method is applied to various terminal devices and servers to perform voiceprint registration, so as to solve the problems of long registration time and low voiceprint feature accuracy during voiceprint registration. As shown in Fig. 5, the voiceprint registration method includes the following steps:
S21: Acquire a voiceprint registration request, where the voiceprint registration request includes a registration user identifier and a current time.
The voiceprint registration request refers to a request, made by a user, to register using a voiceprint feature. The registration user identifier is used to identify the user who makes the voiceprint registration request. In a specific embodiment, the registration user identifier may be the user's phone number, a user account, or an identity card number. Preferably, the registration user identifier corresponds to the original user identifier; for example, when the original user identifier is a phone number, the registration user identifier is also a phone number. The current time refers to the system time at which the voiceprint registration request is acquired.
S22: Query the speech database based on the registration user identifier to obtain the target index corresponding to the original user identifier that matches the registration user identifier, where the speech database is the speech database created using the voice database establishing method of Embodiment 1.
Based on the registration user identifier in the voiceprint registration request, a query is made in the speech database, where the speech database is the speech database created using the voice database establishing method of Embodiment 1. When the original user identifier in an index matches the registration user identifier, that index is a target index. An original user identifier matching the registration user identifier means that the original user identifier is identical to the registration user identifier. Specifically, the query is made over the indexes established for the valid voice data in the speech database, and the indexes containing an original user identifier that matches the registration user identifier are retrieved to obtain the target indexes.
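The index lookup of S22 reduces to an equality match on the original user identifier. A minimal illustration (the record layout and names are assumptions, not the patent's data format):

```python
from dataclasses import dataclass

@dataclass
class IndexEntry:
    """One index record per Embodiment 1: original user identifier,
    voice collection time, and signal-to-noise ratio."""
    original_user_id: str
    collect_time: float  # e.g. a Unix timestamp
    snr: float

def find_target_indexes(index, registration_user_id):
    """S22: an entry is a target index when its original user identifier
    is identical to the registration user identifier."""
    return [e for e in index if e.original_user_id == registration_user_id]
```
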
S23: Obtain the composite index corresponding to each target index according to the current time and the voice collection time and signal-to-noise ratio of the target index.
The voice collection time generally represents the time at which the voice was recorded, and a user's voice may change slightly over time. The closer the voice collection time is to the current time, the closer the piece of valid voice data is to the user's current voice, and thus the more consistent the voiceprint features. In addition, the noise level of the valid voice data can be judged intuitively from its signal-to-noise ratio: the higher the signal-to-noise ratio, the lower the noise of the valid voice data, from which its speech quality can be known.
Based on the current time, and taking both the voice collection time and the signal-to-noise ratio into account, the composite index corresponding to each target index can be obtained.
S24: Select the valid voice data corresponding to the target index with the highest composite index as the registration voice data.
The registration voice data refers to the valid voice data whose voiceprint feature is most consistent with the user. Among the target indexes, the higher the composite index corresponding to a target index, the more consistent with the user the voiceprint feature obtained from the valid voice data corresponding to that target index will be. Therefore, the valid voice data corresponding to the target index with the highest composite index can be selected as the registration voice data, which improves the accuracy of the registration voiceprint.
In a specific embodiment, obtaining the composite index corresponding to each target index according to the current time and the voice collection time and signal-to-noise ratio of the target index specifically includes: calculating the composite index corresponding to each target index using a composite index calculation formula, according to the current time and the voice collection time and signal-to-noise ratio of the target index. The composite index calculation formula is:
Composite index = a * signal-to-noise ratio + (1 − a) * [1 / (current time − voice collection time)];
where a is a preset weight and 0 ≤ a ≤ 1.
In the valid voice data, the higher the signal-to-noise ratio, the less noise the valid voice data contains. And the closer the voice collection time is to the current time, the closer the piece of valid voice data is to the user's current voice, and thus the closer its voiceprint features. Therefore, based on these two factors, and further according to the needs of the actual application scenario, preset weights are assigned to the two factors, and the composite index of each piece of valid voice data can be obtained through the composite index calculation formula. After the composite index of each piece of valid voice data is obtained, each piece of valid voice data can be weighed by this intuitive numerical value, so as to select the most suitable target valid voice data.
For example, the preset weight a may be set to 0.7, in which case the composite index calculation formula is: composite index = 0.7 * signal-to-noise ratio + 0.3 * [1 / (current time − voice collection time)]. After any voiceprint registration request is received, the valid voice data stored in the speech database is retrieved by querying with the registration user identifier in the voiceprint registration request, and the composite index of each piece of valid voice data is calculated according to the composite index calculation formula.
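The composite index formula and the selection in S24 can be sketched directly; the candidate tuple layout (snr, collection time, data) is an illustrative assumption:

```python
def composite_index(snr, current_time, collect_time, a=0.7):
    """Composite index = a * SNR + (1 - a) * [1 / (current - collection)],
    with preset weight 0 <= a <= 1; a = 0.7 matches the worked example."""
    assert 0.0 <= a <= 1.0 and current_time > collect_time
    return a * snr + (1.0 - a) * (1.0 / (current_time - collect_time))

def pick_registration_voice(candidates, current_time, a=0.7):
    """S24: choose the candidate (snr, collect_time, data) with the
    highest composite index as the registration voice data."""
    return max(candidates,
               key=lambda c: composite_index(c[0], current_time, c[1], a))
```

With a = 0.7, a high-SNR but older recording can outrank a recent noisy one; lowering a shifts the weight toward recency.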
S25: Based on the registration voice data, obtain the corresponding voiceprint feature as the registration voiceprint.
After the registration voice data is obtained, the corresponding voiceprint feature is obtained based on the registration voice data and used as the registration voiceprint.
In a specific embodiment, the voiceprint features of the valid voice data may be extracted in advance and associated with the indexes in step S14, so that the corresponding voiceprint feature can be quickly found based on the index. In the voiceprint registration stage, after the registration voice data is obtained, the voiceprint feature corresponding to the registration voice data can be obtained directly as the registration voiceprint, further reducing the voiceprint registration time.
In the voiceprint registration method provided by this embodiment of the present invention, a voiceprint registration request is acquired to trigger voiceprint registration. The speech database is then queried based on the registration user identifier to obtain the target index corresponding to the original user identifier that matches the registration user identifier, where the speech database is the speech database created using the voice database establishing method of Embodiment 1. According to the current time and the voice collection time and signal-to-noise ratio of the target index, the composite index corresponding to each target index is obtained; that is, the composite index of the corresponding valid voice data can be obtained through the target index. The valid voice data corresponding to the target index with the highest composite index is selected as the registration voice data, which improves the accuracy of the registration voiceprint. After the registration voice data is obtained, the corresponding voiceprint feature is obtained based on the registration voice data as the registration voiceprint. This voiceprint registration method performs voiceprint registration using the speech database created by the voice database establishing method in Embodiment 1, which improves the accuracy of voiceprint feature extraction in the voiceprint registration stage and reduces the registration time. During voiceprint registration, obtaining the composite index of the corresponding valid voice data based on the target index helps to quickly locate suitable valid voice data and ensures that the voiceprint feature most consistent with the user is obtained, further improving the accuracy of voiceprint registration.
In a specific embodiment, querying the speech database based on the registration user identifier, as shown in Fig. 6, further includes the following steps:
S221: If there is no original user identifier in the speech database that matches the registration user identifier, send a voice recording request.
There may be no valid voice data in the speech database that matches the registration user identifier; in this case, a voice recording request is sent, and the registration voiceprint is obtained by acquiring voice recording data in real time. Specifically, the indexes in the speech database are queried with the registration user identifier; if no original user identifier matching the registration user identifier exists in the indexes, then no valid voice data matching the registration user identifier exists in the speech database, and a voice recording request is sent.
S222: Acquire the voice recording data corresponding to the voice recording request.
After the voice recording request is sent, the user can record his or her voice according to the prompt; after the voice recording is finished, the voice recording data is acquired.
S223: Extract the corresponding voiceprint feature from the voice recording data as the registration voiceprint.
After the voice recording data recorded by the user is obtained, the corresponding voiceprint feature is extracted from the voice recording data as the registration voiceprint. Here, a voiceprint feature refers to an essential characteristic in the voice data that characterizes a person, such as the pitch contour, the frequency bandwidths and trajectories of the formants, spectral envelope parameters, auditory characteristic parameters, linear prediction cepstral coefficients and their derived parameters, or hybrid parameters; the extraction may refer to step S121 in the foregoing embodiment and is not repeated here.
In this embodiment, when no valid voice data corresponding to the registration user identifier exists in the speech database, the registration voiceprint is obtained by recording voice data in real time, which avoids the situation where the user cannot register and improves the completeness and reasonableness of the voiceprint registration method.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.
Embodiment 4
Fig. 7 shows a functional block diagram of a voiceprint registration device corresponding one-to-one with the voiceprint registration method in Embodiment 3. As shown in Fig. 7, the voiceprint registration device includes a voiceprint registration request acquisition module 21, a target index acquisition module 22, a composite index acquisition module 23, a registration voice data acquisition module 24, and a registration voiceprint acquisition module 25. The functions realized by the voiceprint registration request acquisition module 21, the target index acquisition module 22, the composite index acquisition module 23, the registration voice data acquisition module 24, and the registration voiceprint acquisition module 25 correspond one-to-one with the steps of the voiceprint registration method in Embodiment 3; to avoid repetition, this embodiment does not describe them in detail one by one.
The voiceprint registration request acquisition module 21 is configured to acquire a voiceprint registration request, where the voiceprint registration request includes a registration user identifier and a current time.
The target index acquisition module 22 is configured to query the speech database based on the registration user identifier to obtain the target index corresponding to the original user identifier that matches the registration user identifier, where the speech database is the speech database created using the voice database establishing method described in Embodiment 1.
The composite index acquisition module 23 is configured to obtain the composite index corresponding to each target index according to the current time and the voice collection time and signal-to-noise ratio of the target index.
The registration voice data acquisition module 24 is configured to select the valid voice data corresponding to the target index with the highest composite index as the registration voice data.
The registration voiceprint acquisition module 25 is configured to obtain, based on the registration voice data, the corresponding voiceprint feature as the registration voiceprint.
Preferably, the target index acquisition module 22 further includes a voice recording request sending unit 221, a voice recording data acquisition unit 222, and a registration voiceprint extraction unit 223.
The voice recording request sending unit 221 is configured to send a voice recording request when there is no original user identifier in the speech database that matches the registration user identifier.
The voice recording data acquisition unit 222 is configured to acquire the voice recording data corresponding to the voice recording request.
The registration voiceprint extraction unit 223 is configured to extract the corresponding voiceprint feature from the voice recording data as the registration voiceprint.
Embodiment 5
This embodiment provides a computer-readable storage medium on which a computer program is stored. When the computer program is executed by a processor, it implements the voice database establishing method in Embodiment 1 or the voiceprint registration method in Embodiment 3; to avoid repetition, details are not repeated here. Alternatively, when the computer program is executed by a processor, it realizes the functions of the modules/units of the speech database creating device in Embodiment 2, or the functions of the modules/units of the voiceprint registration device in Embodiment 4; to avoid repetition, details are not repeated here.
Embodiment 6
Fig. 8 is a schematic diagram of a terminal device provided by an embodiment of the present invention. As shown in Fig. 8, the terminal device 80 of this embodiment includes a processor 81, a memory 82, and a computer program 83 stored in the memory 82 and executable on the processor 81. When executing the computer program 83, the processor 81 implements the steps of the voice database establishing method in the above Embodiment 1, such as steps S11 to S14 shown in Fig. 1. Alternatively, when executing the computer program 83, the processor 81 realizes the functions of the modules/units in Embodiment 2, such as the functions of the original voice data acquisition module 11, the data preprocessing module 12, the signal-to-noise ratio acquisition module 13, and the speech database index establishing module 14 shown in Fig. 4. Alternatively, when executing the computer program 83, the processor 81 implements the steps of the voiceprint registration method in the above Embodiment 3, such as steps S21 to S25 shown in Fig. 5. Alternatively, when executing the computer program 83, the processor 81 realizes the functions of the modules/units in Embodiment 4, such as the functions of the voiceprint registration request acquisition module 21, the target index acquisition module 22, the composite index acquisition module 23, the registration voice data acquisition module 24, and the registration voiceprint acquisition module 25 shown in Fig. 7.
Illustratively, the computer program 83 may be divided into one or more modules/units, which are stored in the memory 82 and executed by the processor 81 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program 83 in the terminal device 80. For example, the computer program 83 may be divided into the original voice data acquisition module 11, the data preprocessing module 12, the signal-to-noise ratio acquisition module 13, and the speech database index establishing module 14 shown in Fig. 4, whose specific functions are as described in Embodiment 2 and are not repeated here. Alternatively, the computer program 83 may be divided into the voiceprint registration request acquisition module 21, the target index acquisition module 22, the composite index acquisition module 23, the registration voice data acquisition module 24, and the registration voiceprint acquisition module 25, whose specific functions are as described in Embodiment 4 and are not repeated here.
The terminal device 80 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The terminal device may include, but is not limited to, the processor 81 and the memory 82. Those skilled in the art will understand that Fig. 8 is merely an example of the terminal device 80 and does not constitute a limitation on the terminal device 80; the terminal device may include more or fewer components than shown, combine certain components, or have different components. For example, the terminal device may further include input/output devices, network access devices, a bus, and the like.
The processor 81 may be a Central Processing Unit (CPU), or may be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 82 may be an internal storage unit of the terminal device 80, such as a hard disk or memory of the terminal device 80. The memory 82 may also be an external storage device of the terminal device 80, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the terminal device 80. Further, the memory 82 may include both an internal storage unit and an external storage device of the terminal device 80. The memory 82 is used to store the computer program and other programs and data required by the terminal device. The memory 82 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the division of the above functional units and modules is used as an example. In practical applications, the above functions may be allocated to different functional units and modules as needed, i.e., the internal structure of the device may be divided into different functional units or modules to complete all or part of the functions described above.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the present invention realizes all or part of the flow of the methods of the above embodiments, which may also be completed by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, the computer program can realize the steps of each of the above method embodiments. The computer program includes computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a Read-Only Memory (ROM), a Random Access Memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in the jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
The above embodiments are merely used to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments, or make equivalent replacements of some of the technical features therein; and such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all be included within the protection scope of the present invention.
Claims (10)
1. A voice database creation method, characterized by comprising:
obtaining original voice data, the original voice data including an original user identifier and a voice collection time;
preprocessing the original voice data to obtain valid voice data;
obtaining the signal-to-noise ratio corresponding to the valid voice data;
storing the valid voice data in a voice database, and establishing an index for the valid voice data in the voice database, the index including the original user identifier, the voice collection time, and the signal-to-noise ratio.
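For illustration only (the claim specifies no storage engine), the storage-and-index step of claim 1 might be sketched as follows. The use of SQLite and all table and column names (`voice_data`, `user_id`, `collect_time`, `snr`, `audio`) are assumptions, not part of the disclosure:

```python
import sqlite3

def create_voice_db(path=":memory:"):
    """Create a voice database whose index covers the three fields of claim 1."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS voice_data (
               id INTEGER PRIMARY KEY,
               user_id TEXT NOT NULL,       -- original user identifier
               collect_time REAL NOT NULL,  -- voice collection time (epoch seconds)
               snr REAL NOT NULL,           -- signal-to-noise ratio of the valid voice data
               audio BLOB NOT NULL          -- preprocessed (valid) voice data
           )"""
    )
    # Index on (original user identifier, voice collection time, signal-to-noise ratio).
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_voice ON voice_data (user_id, collect_time, snr)"
    )
    return conn

def store_valid_voice(conn, user_id, collect_time, snr, audio):
    """Store one piece of valid voice data together with its index fields."""
    conn.execute(
        "INSERT INTO voice_data (user_id, collect_time, snr, audio) VALUES (?, ?, ?, ?)",
        (user_id, collect_time, snr, audio),
    )
    conn.commit()
```

With such a layout, a later lookup by user identifier can be answered from the (user_id, collect_time, snr) index without reading the audio blobs.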
2. The voice database creation method according to claim 1, characterized in that preprocessing the original voice data to obtain valid voice data specifically comprises:
performing filtering processing and silence-removal processing on the original voice data corresponding to each original user identifier to obtain valid voice data.
3. The voice database creation method according to claim 2, characterized in that performing filtering processing on the original voice data corresponding to each original user identifier specifically comprises:
extracting voiceprint features from the original voice data corresponding to a same original user identifier;
based on the voiceprint features, performing cluster analysis on the original voice data corresponding to the same original user identifier using the k-means clustering algorithm to obtain a target center point;
calculating, using a distance algorithm, the distance between each piece of original voice data corresponding to the same original user identifier and the target center point;
removing, from the original voice data corresponding to the same original user identifier, each piece of original voice data whose distance from the target center point exceeds a distance threshold.
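The filtering of claim 3 can be sketched in plain Python. The claim specifies k-means clustering on voiceprint features; this sketch uses the degenerate single-cluster case, in which the k-means center is simply the feature mean, and assumes Euclidean distance as the unspecified "distance algorithm":

```python
import math

def filter_by_centroid(features, dist_threshold):
    """Claim 3 sketch: find a target center point for one user's voiceprint
    features, then drop samples farther from it than dist_threshold.
    With a single cluster, k-means converges to the plain feature mean."""
    dim = len(features[0])
    n = len(features)
    # Target center point: per-dimension mean of all feature vectors.
    center = [sum(f[i] for f in features) / n for i in range(dim)]

    def dist(f):
        # Euclidean distance to the target center point.
        return math.sqrt(sum((f[i] - center[i]) ** 2 for i in range(dim)))

    kept = [f for f in features if dist(f) <= dist_threshold]
    return kept, center
```

A full k-means with k > 1 would proceed the same way per cluster; only the center computation changes.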
4. A voiceprint registration method, characterized by comprising:
obtaining a voiceprint registration request, the voiceprint registration request including a registration user identifier and a current time;
querying a voice database based on the registration user identifier to obtain the target indexes corresponding to the original user identifier that matches the registration user identifier, the voice database being a voice database created by the voice database creation method according to any one of claims 1-3;
obtaining the composite index corresponding to each target index according to the current time and the voice collection time and signal-to-noise ratio of the target index;
selecting the valid voice data corresponding to the target index with the highest composite index as registration voice data;
obtaining, based on the registration voice data, the corresponding voiceprint features as a registration voiceprint.
5. The voiceprint registration method according to claim 4, characterized in that obtaining the composite index corresponding to each target index according to the current time and the voice collection time and signal-to-noise ratio of the target index specifically comprises:
calculating the composite index corresponding to each target index from the current time and the voice collection time and signal-to-noise ratio of the target index, using a composite index calculation formula;
the composite index calculation formula being:
composite index = a * signal-to-noise ratio + (1 - a) * [1 / (current time - voice collection time)];
where a is a preset weight and 0 ≤ a ≤ 1.
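The formula of claim 5, together with the selection step of claim 4, might be sketched as follows. The dictionary keys (`snr`, `collect_time`), the default weight, and the assumption that both times share a unit with current time strictly later than collection time are illustrative choices, not from the claims:

```python
def composite_index(snr, current_time, collect_time, a=0.5):
    """Claim 5: composite index = a * SNR + (1 - a) * [1 / (current - collection time)].
    a is a preset weight with 0 <= a <= 1; a higher SNR or a more recent
    sample yields a higher score."""
    if not 0 <= a <= 1:
        raise ValueError("weight a must satisfy 0 <= a <= 1")
    return a * snr + (1 - a) * (1.0 / (current_time - collect_time))

def pick_registration_sample(samples, current_time, a=0.5):
    """Claim 4: choose the sample whose target index has the highest composite index."""
    return max(
        samples,
        key=lambda s: composite_index(s["snr"], current_time, s["collect_time"], a),
    )
```

Note the recency term 1/(current time - voice collection time) decays toward zero for old samples, so with a large weight a the choice is dominated by signal-to-noise ratio.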
6. The voiceprint registration method according to claim 4, characterized in that querying the voice database based on the registration user identifier further comprises:
if no original user identifier matching the registration user identifier exists in the voice database, sending a voice recording request;
obtaining the voice recording data corresponding to the voice recording request;
extracting the corresponding voiceprint features from the voice recording data as the registration voiceprint.
7. A voice database creation apparatus, characterized by comprising:
an original voice data acquisition module, configured to obtain original voice data, the original voice data including an original user identifier and a voice collection time;
a data preprocessing module, configured to preprocess the original voice data to obtain valid voice data;
a signal-to-noise ratio acquisition module, configured to obtain the signal-to-noise ratio corresponding to the valid voice data;
a voice database index establishment module, configured to store the valid voice data in a voice database and to establish an index for the valid voice data in the voice database, the index including the original user identifier, the voice collection time, and the signal-to-noise ratio.
8. A voiceprint registration apparatus, characterized by comprising:
a voiceprint registration request acquisition module, configured to obtain a voiceprint registration request, the voiceprint registration request including a registration user identifier and a current time;
a target index acquisition module, configured to query a voice database based on the registration user identifier and obtain the target indexes corresponding to the original user identifier that matches the registration user identifier, the voice database being a voice database created by the voice database creation method according to any one of claims 1-3;
a composite index acquisition module, configured to obtain the composite index corresponding to each target index according to the current time and the voice collection time and signal-to-noise ratio of the target index;
a registration voice data acquisition module, configured to select the valid voice data corresponding to the target index with the highest composite index as registration voice data;
a registration voiceprint acquisition module, configured to obtain, based on the registration voice data, the corresponding voiceprint features as a registration voiceprint.
9. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, realizes the steps of the voice database creation method according to any one of claims 1 to 3; or the processor, when executing the computer program, realizes the steps of the voiceprint registration method according to any one of claims 4 to 6.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, realizes the steps of the voice database creation method according to any one of claims 1 to 3; or the computer program, when executed by a processor, realizes the steps of the voiceprint registration method according to any one of claims 4 to 6.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810031164.9A CN108460081B (en) | 2018-01-12 | 2018-01-12 | Voice data base establishing method, voiceprint registration method, apparatus, equipment and medium |
PCT/CN2018/077234 WO2019136801A1 (en) | 2018-01-12 | 2018-02-26 | Voice database creation method, voiceprint registration method, device, equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108460081A true CN108460081A (en) | 2018-08-28 |
CN108460081B CN108460081B (en) | 2019-07-12 |
Family
ID=63221350
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810031164.9A Active CN108460081B (en) | 2018-01-12 | 2018-01-12 | Voice data base establishing method, voiceprint registration method, apparatus, equipment and medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108460081B (en) |
WO (1) | WO2019136801A1 (en) |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109065056A (en) * | 2018-09-26 | 2018-12-21 | 珠海格力电器股份有限公司 | A kind of method and device of voice control air-conditioning |
CN109727602A (en) * | 2018-12-29 | 2019-05-07 | 苏州思必驰信息科技有限公司 | A kind of method for recognizing sound-groove and device of mobile device terminal |
CN110648671A (en) * | 2019-08-21 | 2020-01-03 | 广州国音智能科技有限公司 | Voiceprint model reconstruction method, terminal, device and readable storage medium |
CN110689894A (en) * | 2019-08-15 | 2020-01-14 | 深圳市声扬科技有限公司 | Automatic registration method and device and intelligent equipment |
CN110738524A (en) * | 2019-10-15 | 2020-01-31 | 上海云从企业发展有限公司 | service data management method, system, equipment and medium |
CN110782902A (en) * | 2019-11-06 | 2020-02-11 | 北京远鉴信息技术有限公司 | Audio data determination method, apparatus, device and medium |
CN110875043A (en) * | 2019-11-11 | 2020-03-10 | 广州国音智能科技有限公司 | Voiceprint recognition method and device, mobile terminal and computer readable storage medium |
CN111128198A (en) * | 2019-12-25 | 2020-05-08 | 厦门快商通科技股份有限公司 | Voiceprint recognition method, voiceprint recognition device, storage medium, server and voiceprint recognition system |
CN111243601A (en) * | 2019-12-31 | 2020-06-05 | 北京捷通华声科技股份有限公司 | Voiceprint clustering method and device, electronic equipment and computer-readable storage medium |
CN111415669A (en) * | 2020-04-15 | 2020-07-14 | 厦门快商通科技股份有限公司 | Voiceprint model construction method, device and equipment |
CN111856399A (en) * | 2019-04-26 | 2020-10-30 | 北京嘀嘀无限科技发展有限公司 | Positioning identification method and device based on sound, electronic equipment and storage medium |
WO2021052306A1 (en) * | 2019-09-19 | 2021-03-25 | 北京三快在线科技有限公司 | Voiceprint feature registration |
CN112992181A (en) * | 2021-02-08 | 2021-06-18 | 上海哔哩哔哩科技有限公司 | Audio classification method and device |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112258220A (en) * | 2020-10-12 | 2021-01-22 | 北京豆牛网络科技有限公司 | Information acquisition and analysis method, system, electronic device and computer readable medium |
WO2024049311A1 (en) * | 2022-08-30 | 2024-03-07 | Biometriq Sp. Z O.O. | Method of selecting the optimal voiceprint |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070071206A1 (en) * | 2005-06-24 | 2007-03-29 | Gainsboro Jay L | Multi-party conversation analyzer & logger |
CN102509547A (en) * | 2011-12-29 | 2012-06-20 | 辽宁工业大学 | Method and system for voiceprint recognition based on vector quantization |
CN106095799A (en) * | 2016-05-30 | 2016-11-09 | 广州多益网络股份有限公司 | The storage of a kind of voice, search method and device |
CN106782564A (en) * | 2016-11-18 | 2017-05-31 | 百度在线网络技术(北京)有限公司 | Method and apparatus for processing speech data |
Also Published As
Publication number | Publication date |
---|---|
WO2019136801A1 (en) | 2019-07-18 |
CN108460081B (en) | 2019-07-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108460081B (en) | Voice data base establishing method, voiceprint registration method, apparatus, equipment and medium | |
CN110310623B (en) | Sample generation method, model training method, device, medium, and electronic apparatus | |
CN102509547B (en) | Method and system for voiceprint recognition based on vector quantization | |
WO2019037205A1 (en) | Voice fraud identifying method and apparatus, terminal device, and storage medium | |
CN109243465A (en) | Voiceprint authentication method, device, computer equipment and storage medium | |
CN110265040A (en) | Training method, device, storage medium and the electronic equipment of sound-groove model | |
CN107680582A (en) | Acoustic training model method, audio recognition method, device, equipment and medium | |
CN107610708B (en) | Identify the method and apparatus of vocal print | |
CN109215665A (en) | A kind of method for recognizing sound-groove based on 3D convolutional neural networks | |
JPH05216490A (en) | Apparatus and method for speech coding and apparatus and method for speech recognition | |
CN106128465A (en) | A kind of Voiceprint Recognition System and method | |
CN112259106A (en) | Voiceprint recognition method and device, storage medium and computer equipment | |
CN113436612B (en) | Intention recognition method, device, equipment and storage medium based on voice data | |
CN113223536B (en) | Voiceprint recognition method and device and terminal equipment | |
CN109036437A (en) | Accents recognition method, apparatus, computer installation and computer readable storage medium | |
CN108269575A (en) | Update audio recognition method, terminal installation and the storage medium of voice print database | |
CN110428853A (en) | Voice activity detection method, Voice activity detection device and electronic equipment | |
CN110400567A (en) | Register vocal print dynamic updating method and computer storage medium | |
CN113782032A (en) | Voiceprint recognition method and related device | |
JPH09507921A (en) | Speech recognition system using neural network and method of using the same | |
Nijhawan et al. | Speaker recognition using support vector machine | |
CN109817196A (en) | A kind of method of canceling noise, device, system, equipment and storage medium | |
CN115631748A (en) | Emotion recognition method and device based on voice conversation, electronic equipment and medium | |
Renisha et al. | Cascaded Feedforward Neural Networks for speaker identification using Perceptual Wavelet based Cepstral Coefficients | |
CN112967734B (en) | Music data identification method, device, equipment and storage medium based on multiple sound parts |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||