CN108010533A - Method and device for automatically identifying the bitrate of audio data - Google Patents
- Publication number
- CN108010533A (application number CN201610957146.4A)
- Authority
- CN
- China
- Prior art keywords
- code check
- target class
- voice data
- class code
- labeled
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
Abstract
The present invention relates to a method and device for automatically identifying the bitrate of audio data. The method comprises: labeling audio data to be predicted according to a trained automatic identification model, to obtain labeled data in the target-class bitrate format and labeled data in a non-target-class bitrate format; and comparing the occurrence probability of the labeled data in the target-class bitrate format with a preset threshold probability, and, if that probability is greater than or equal to the preset threshold probability, outputting the labeled data in the target-class bitrate format. Embodiments of the present invention thereby identify the bitrates of different audio data automatically.
Description
Technical field
The present invention relates to the field of audio technology, and in particular to a method and device for automatically identifying the bitrate of audio data.
Background art
At present, MP3 (MPEG-1 or MPEG-2 Audio Layer III) is one of the most popular digital audio coding and lossy compression formats. It is designed to greatly reduce the amount of audio data. Because MP3 is a lossy format that produces small music files, it makes transmission and storage easier and is more convenient for users, and it has therefore spread rapidly. One of the key techniques used in MP3 is the psychoacoustic model, which discards the parts of pulse-code-modulated (PCM) audio data that matter little to human hearing, so that the digital audio file can be compressed.
Audio files in MP3 format are compressed at different bitrates. The bitrate is the number of data bits transmitted per unit of time; it indicates how many bits are needed to represent each second of compressed video or audio. The usual unit is kbps, that is, kilobits per second. Reflecting the correspondence between data size and sound quality, the mainstream bitrates are 320 kbps, 256 kbps, 224 kbps, 192 kbps, 128 kbps, 96 kbps, and 64 kbps. However, with the spread of music format conversion software, a large amount of falsely labeled high-bitrate digital music that was actually converted from low-bitrate files has appeared on the market. With such music, the quality the user actually hears does not match what was expected, which degrades the user experience.
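The relationship between bitrate and data size described above can be illustrated with a small calculation. This is only a sketch: the function name `mp3_size_bytes` is an illustrative assumption, 1 kbps is taken as 1000 bits per second, and container or tag overhead is ignored.

```python
# Mainstream MP3 bitrates listed in the text, in kbps.
MAINSTREAM_KBPS = [320, 256, 224, 192, 128, 96, 64]


def mp3_size_bytes(bitrate_kbps, duration_s):
    """Approximate payload size of a constant-bitrate stream.

    Assumes 1 kbps = 1000 bits per second and ignores header/tag overhead.
    """
    return int(bitrate_kbps * 1000 * duration_s / 8)


if __name__ == "__main__":
    # Size of a 4-minute (240 s) track at each mainstream bitrate.
    for kbps in MAINSTREAM_KBPS:
        size_mb = mp3_size_bytes(kbps, 240) / 1_000_000
        print(f"{kbps:>3} kbps: about {size_mb:.2f} MB")
```

A file relabeled from 128 kbps to 320 kbps grows to 2.5 times the size without regaining any of the discarded audio information, which is why the bitrate in the file header alone cannot be trusted.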
At present, digital music service providers mainly identify audio bitrates manually. Manual identification not only consumes a great deal of labor but is also inefficient and inaccurate, and its quality is difficult to monitor. A method for automatically identifying the bitrate of audio data is therefore needed.
Summary of the invention
Embodiments of the present invention provide a method and device for automatically identifying the bitrate of audio data. A model is trained on collected audio data to obtain a trained automatic identification model for the audio data bitrate. According to this model, audio data to be predicted is labeled, yielding labeled data in the target-class bitrate format and labeled data in a non-target-class bitrate format, so that the bitrates of different audio data are identified automatically.
In a first aspect, an embodiment of the present invention provides a method for automatically identifying the bitrate of audio data, the method comprising:
training a model on collected audio data to obtain a trained automatic identification model for the audio data bitrate;
labeling audio data to be predicted according to the trained automatic identification model, to obtain labeled data in the target-class bitrate format and labeled data in a non-target-class bitrate format;
comparing the occurrence probability of the labeled data in the target-class bitrate format with a preset threshold probability, and, if that probability is greater than or equal to the preset threshold probability, outputting the labeled data in the target-class bitrate format.
Preferably, training a model on the collected audio data to obtain the trained automatic identification model for the audio data bitrate specifically comprises:
labeling the audio data to generate training samples of labeled data in the target-class bitrate format;
converting the audio data of the labeled data in the target-class bitrate format into a corresponding spectrogram;
scaling the spectrogram to obtain a corresponding thumbnail;
training a model on the image data of the thumbnail using a convolutional neural network algorithm, to obtain the trained model for automatic identification of the audio data bitrate.
Preferably, the target-class bitrate is a target-class bitrate of the MP3 format, and is any one of the following bitrates: 320 kbps, 256 kbps, 224 kbps, 192 kbps, 128 kbps, 96 kbps, and 64 kbps.
Preferably, the non-target-class bitrate is a non-target-class bitrate of the MP3 format, and comprises all of the remaining bitrates that differ from the target-class bitrate of the MP3 format.
Preferably, the spectrogram is scaled by bilinear interpolation to obtain the corresponding thumbnail.
Preferably, with bilinear interpolation used for scaling, an AlexNet convolutional neural network is used as the model and is trained on the image data of the thumbnail, to obtain the trained model for automatic identification of the audio data bitrate.
Preferably, the AlexNet convolutional neural network specifically comprises 1 input layer, 5 convolutional layers, 3 pooling layers, 2 fully connected layers, and 1 output layer.
Preferably, the trained automatic identification model is deployed to a digital music storage server cluster in order to label the audio data to be predicted.
Preferably, the trained automatic identification model is deployed to the digital music storage server cluster in CPU mode.
In a second aspect, an embodiment of the present invention provides a device for automatically identifying the bitrate of audio data, the device comprising:
a training model acquisition module, which trains a model on collected audio data to obtain a trained automatic identification model for the audio data bitrate;
a labeled data acquisition module, which labels audio data to be predicted according to the trained automatic identification model, to obtain labeled data in the target-class bitrate format and labeled data in a non-target-class bitrate format;
a comparison module, which compares the occurrence probability of the labeled data in the target-class bitrate format with a preset threshold probability and, if that probability is greater than or equal to the preset threshold probability, outputs the labeled data in the target-class bitrate format.
An embodiment of the present invention provides a method for automatically identifying the bitrate of audio data. A model is trained on collected audio data to obtain a trained automatic identification model for the audio data bitrate. According to this model, audio data to be predicted is labeled, yielding labeled data in the target-class bitrate format and labeled data in a non-target-class bitrate format. The occurrence probability of the labeled data in the target-class bitrate format is compared with a preset threshold probability, and if that probability is greater than or equal to the preset threshold probability, the labeled data in the target-class bitrate format is output. It is this threshold comparison that allows the embodiment to identify the bitrates of different audio data automatically.
Brief description of the drawings
Fig. 1 is a flowchart of the method for automatically identifying the bitrate of audio data provided by an embodiment of the present invention;
Fig. 2 is a structural diagram of the device for automatically identifying the bitrate of audio data provided by an embodiment of the present invention.
Detailed description of the embodiments
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
To aid understanding of the embodiments of the present invention, specific embodiments are explained further below with reference to the accompanying drawings.
In the technical solution provided by the present invention, a model is trained on collected audio data to obtain a trained automatic identification model for the audio data bitrate. According to this model, audio data to be predicted is labeled, yielding labeled data in the target-class bitrate format and labeled data in a non-target-class bitrate format. The occurrence probability of the labeled data in the target-class bitrate format is compared with a preset threshold probability, and if that probability is greater than or equal to the preset threshold probability, the labeled data in the target-class bitrate format is output, so that the bitrates of different audio data are identified automatically.
The technical solution of the present invention is described in detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of the method for automatically identifying the bitrate of audio data provided by an embodiment of the present invention. As shown in Fig. 1, the method comprises the following steps:
S101: Train a model on collected audio data to obtain a trained automatic identification model for the audio data bitrate.
Specifically, this step comprises the following sub-steps:
The audio data is labeled to generate training samples of labeled data in the target-class bitrate format.
To ensure the accuracy of the automatic identification model obtained by training on the samples, the audio data used in the specific embodiment of the present invention consists of low-bitrate music files generated by compressing lossless music.
Further, the audio data is preprocessed as follows: tracks are ripped from high-quality CDs to generate digital music files in WAV format; the WAV files are transcoded into MP3 format at each of the bitrates 320 kbps, 256 kbps, 224 kbps, 192 kbps, 128 kbps, 96 kbps, and 64 kbps; the 320 kbps MP3 files are used as positive samples, and the MP3 files at the remaining six bitrates are used as negative samples.
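The transcoding and labeling steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not the procedure used in the patent: the source names no encoder, so the use of the `ffmpeg` command line with its `libmp3lame` encoder is an assumption, and the helper names, paths, and file-naming scheme are hypothetical. The code only builds the commands and does not run them.

```python
from pathlib import Path
from typing import List, Tuple

BITRATES_KBPS = [320, 256, 224, 192, 128, 96, 64]
TARGET_KBPS = 320  # positive class in the described embodiment


def transcode_command(wav_path: str, kbps: int, out_dir: str) -> List[str]:
    """Build (but do not run) an ffmpeg command encoding a WAV file to MP3.

    Assumption: ffmpeg with the libmp3lame encoder is available; "-b:a"
    sets the audio bitrate.
    """
    out = Path(out_dir) / f"{Path(wav_path).stem}_{kbps}k.mp3"
    return ["ffmpeg", "-i", wav_path, "-codec:a", "libmp3lame",
            "-b:a", f"{kbps}k", out.as_posix()]


def build_samples(wav_path: str, out_dir: str) -> List[Tuple[List[str], int]]:
    """Return (command, label) pairs: label 1 for the target bitrate, else 0."""
    return [
        (transcode_command(wav_path, kbps, out_dir),
         1 if kbps == TARGET_KBPS else 0)
        for kbps in BITRATES_KBPS
    ]
```

Each ripped WAV track thus yields one positive sample and six negative samples; each command could be executed with `subprocess.run(command, check=True)`.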
The audio data of the labeled data in the target-class bitrate format is converted into a corresponding spectrogram.
It should be noted that a spectrogram represents the time, frequency, and energy information of a sound simultaneously. To preserve the completeness of the information in the audio data, in a specific embodiment of the present invention the spectrogram of the audio data is used as the input of the convolutional neural network algorithm.
The short-time Fourier transform is a common tool for spectral analysis. Compared with the Fourier transform, the short-time Fourier transform introduces a window function and can therefore show how the frequency content of a signal changes over time. In the resulting spectrogram, the horizontal axis represents time, the vertical axis represents frequency, and color represents the magnitude of the energy, using the RGB color model.
In a specific embodiment of the present invention, besides the RGB color model, the energy of the spectrogram can also be represented as a grayscale spectrogram.
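The spectrogram conversion described above can be sketched with a windowed short-time Fourier transform. The frame length, hop size, Hann window, and logarithmic grayscale mapping below are all illustrative assumptions; the source states none of them. The sketch uses only the standard library and a direct DFT for clarity, whereas a practical implementation would normally rely on an FFT library.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def stft_magnitudes(signal, frame_len=64, hop=32):
    """Return one list per frame with magnitudes for bins 0..frame_len//2."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        chunk = [signal[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(chunk[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        frames.append(mags)
    return frames


def to_grayscale(frames):
    """Map log-magnitudes to gray levels 0..255.

    Rows of the result are frequency bins and columns are time frames,
    matching the axis convention described in the text.
    """
    logs = [[math.log10(m + 1e-10) for m in frame] for frame in frames]
    lo = min(min(row) for row in logs)
    hi = max(max(row) for row in logs)
    span = (hi - lo) or 1.0
    return [[round(255 * (logs[t][k] - lo) / span) for t in range(len(logs))]
            for k in range(len(logs[0]))]
```

For a pure tone, the brightest row of the grayscale image is the row of the tone's frequency bin; an RGB rendering would replace the final mapping with a color map.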
To ensure accurate automatic identification of the audio data bitrate, the spectrogram obtained from the audio data of the labeled data in the target-class bitrate format is processed further, as described in detail below:
The spectrogram is scaled to obtain a corresponding thumbnail.
It should be noted that the embodiment of the present invention trains the model on the image data of the thumbnails with a convolutional neural network algorithm, and such an algorithm only accepts image data of a fixed size. Therefore, before the model is trained on the image data of the thumbnails, the spectrograms of all audio data must be normalized to the same size.
In a specific embodiment of the present invention, the spectrogram is scaled by bilinear interpolation to obtain the corresponding thumbnail.
Scaling the spectrogram by bilinear interpolation preserves the continuity between neighboring pixels in the image data while keeping the complexity of the algorithm moderate, so that the resulting thumbnail stays closer to the true spectrogram.
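Bilinear scaling as described above can be sketched in a few lines. The corner-aligned coordinate mapping used here is one common convention and is an assumption, since the source does not say which mapping it uses; the function name is illustrative.

```python
def bilinear_resize(img, out_h, out_w):
    """Resize a 2-D list of numbers to out_h x out_w by bilinear interpolation.

    Uses a corner-aligned mapping: the corner pixels of the output coincide
    with the corner pixels of the input.
    """
    in_h, in_w = len(img), len(img[0])
    out = []
    for y in range(out_h):
        # Fractional source row for this output row.
        sy = y * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, in_h - 1)
        fy = sy - y0
        row = []
        for x in range(out_w):
            sx = x * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, in_w - 1)
            fx = sx - x0
            # Interpolate along x on the two neighboring rows, then along y.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out
```

Each output pixel is a weighted average of at most four source pixels, so the cost grows linearly with the number of output pixels.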
A model is trained on the image data of the thumbnails using a convolutional neural network algorithm, to obtain the trained model for automatic identification of the audio data bitrate.
In a specific embodiment of the present invention, models were trained on data sets of four image sizes: 28×28, 56×56, 84×84, and 256×256. The results show that the larger the image, the higher the accuracy of the resulting model, and also the slower the training.
In practical applications, automatic identification of the audio data bitrate usually has no strict real-time requirement, so the 256×256 image size was used to obtain a high-accuracy model for automatic identification of the audio data bitrate.
A convolutional neural network is a kind of feedforward neural network. It roughly imitates the human visual perception process and is widely used in image data processing.
Further, with bilinear interpolation used for scaling, an AlexNet convolutional neural network is used as the model and is trained on the image data of the thumbnails, to obtain the trained model for automatic identification of the audio data bitrate. Here, the AlexNet convolutional neural network specifically comprises 1 input layer, 5 convolutional layers, 3 pooling layers, 2 fully connected layers, and 1 output layer.
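To make the layer structure above concrete, the following sketch traces the feature-map size through the 5 convolutional and 3 pooling layers. The kernel sizes, strides, paddings, and channel counts are those of the original AlexNet publication and are an assumption here, because the source gives only the layer counts; the 227×227 default input is likewise the size usually quoted for AlexNet.

```python
def out_size(size, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, kernel, stride, padding, output channels).
# Pooling layers keep the channel count, marked with None.
LAYERS = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool3", 3, 2, 0, None),
]


def trace_shapes(input_size=227, channels=3):
    """Return (name, channels, height, width) after every layer."""
    shapes = []
    size = input_size
    for name, kernel, stride, pad, out_ch in LAYERS:
        size = out_size(size, kernel, stride, pad)
        if out_ch is not None:
            channels = out_ch
        shapes.append((name, channels, size, size))
    return shapes


if __name__ == "__main__":
    for name, c, h, w in trace_shapes():
        print(f"{name}: {c} x {h} x {w}")
```

Under these assumed hyperparameters the last pooling layer produces a 256 × 6 × 6 feature map, which is flattened and passed to the fully connected layers and then to the output layer.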
In a specific embodiment of the present invention, the AlexNet convolutional neural network is used as the model and is trained on the positive and negative samples.
In a specific embodiment of the present invention, models were trained on the MP3 data sets at each of the bitrates 320 kbps, 256 kbps, 224 kbps, 192 kbps, 128 kbps, 96 kbps, and 64 kbps. The results show that the recognition accuracy for 320 kbps MP3 reached 98.54%.
It should be noted that, in a specific embodiment of the present invention, besides multi-bitrate automatic identification for music data in MP3 format, multi-bitrate automatic identification can also be performed for music data in the WMA, AAC, and OGG formats.
It should be noted that, in a specific embodiment of the present invention, the AlexNet convolutional neural network is chosen as the model because it has about 60 million parameters, 12 times as many as the GoogLeNet model, which gives it strong representational capacity and makes it easier to learn more accurate features.
Further, the AlexNet convolutional neural network also uses techniques such as ReLU, LRN (local response normalization), and Dropout, which effectively alleviate the problems of activation function saturation and model overfitting while improving the computational performance of the model.
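Two of the techniques named above can be sketched briefly. ReLU does not saturate for positive inputs, and dropout randomly silences units during training to reduce overfitting. The sketch uses the "inverted" dropout variant common in current frameworks (rescaling at training time), which is an assumption and differs in bookkeeping from the original AlexNet formulation; LRN is omitted for brevity, and the function names are illustrative.

```python
import random


def relu(values):
    """Rectified linear unit: negative inputs become zero."""
    return [max(0.0, v) for v in values]


def dropout(values, rate=0.5, rng=None, training=True):
    """Inverted dropout.

    During training each value is zeroed with probability `rate` and the
    survivors are scaled by 1 / (1 - rate); at inference time the input is
    returned unchanged.
    """
    if not training or rate == 0.0:
        return list(values)
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]
```

With `rate=0.5`, roughly half of the activations are zeroed on each training pass and the rest are doubled, so the expected sum of the layer's output stays the same.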
Further, to speed up training, CUDA with GPUs is used for acceleration during model training, shortening the time needed to obtain the trained model for automatic identification of the audio data bitrate.
It should be noted that, in a specific embodiment of the present invention, besides the AlexNet convolutional neural network, other convolutional neural networks such as LeNet, GoogLeNet, and VGG can also be used as the model; technical solutions that use these other convolutional neural networks as the model also fall within the scope of protection of the specific embodiments of the present invention.
S102: Label audio data to be predicted according to the trained automatic identification model, to obtain labeled data in the target-class bitrate format and labeled data in a non-target-class bitrate format.
It should be noted that the target-class bitrate is a target-class bitrate of the MP3 format, and is any one of the following bitrates: 320 kbps, 256 kbps, 224 kbps, 192 kbps, 128 kbps, 96 kbps, and 64 kbps.
The non-target-class bitrate is a non-target-class bitrate of the MP3 format, and comprises all of the remaining bitrates that differ from the aforementioned target-class bitrate of the MP3 format.
S103: Compare the occurrence probability of the labeled data in the target-class bitrate format with a preset threshold probability; if that probability is greater than or equal to the preset threshold probability, output the labeled data in the target-class bitrate format.
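The comparison in S103 can be sketched as follows. The source does not say how the probability is produced, so the softmax over two network outputs, the class ordering, and the default threshold of 0.5 are all assumptions made for illustration.

```python
import math


def softmax(logits):
    """Convert raw network outputs into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_target_class(logits, threshold=0.5, target_index=1):
    """Return (decision, probability) for the target-class bitrate.

    Assumed class order: index 0 = non-target class, index 1 = target class.
    The decision is True when the probability is greater than or equal to
    the threshold, as in step S103.
    """
    probability = softmax(logits)[target_index]
    return probability >= threshold, probability
```

Raising the threshold makes the output more conservative: fewer files are reported as having the target-class bitrate, but those reported are more likely to be genuine.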
In addition, in a specific embodiment of the present invention, the method further comprises: deploying the trained automatic identification model to a digital music storage server cluster in order to label the audio data to be predicted.
In one specific embodiment of the present invention, the trained automatic identification model is deployed to the digital music storage server cluster in GPU mode.
Specifically, in GPU mode the model is deployed to a separate GPU cluster, and the digital music is migrated to that GPU cluster for labeling.
The advantage of GPU mode is faster computation. Its drawbacks are that the digital music labeling task involves a large amount of audio data, which makes data migration difficult, and that the cost is high. Because automatic identification of the audio data bitrate has low real-time requirements and calls for low cost, GPU mode is not the preferred option. For application scenarios that require high speed, such as online services, deploying the model to a separate GPU cluster in GPU mode and migrating the digital music to that cluster for labeling may be considered.
In another specific embodiment of the present invention, the trained automatic identification model is deployed to the digital music storage server cluster in CPU mode.
Specifically, in CPU mode the model is deployed to a separate CPU cluster, and the digital music is migrated to that CPU cluster for labeling.
Because automatic identification of the audio data bitrate has low real-time requirements and calls for low cost, CPU mode is the preferred option. For application scenarios such as offline batch processing of audio data, deploying the model to a separate CPU cluster in CPU mode and migrating the digital music to that cluster for labeling may be considered.
In a specific embodiment of the present invention, besides deployment on CPU clusters and GPU clusters, deployment on other hardware devices such as PCs and mobile phones also falls within the scope of the specific embodiments of the present invention.
In summary, in the method for automatically identifying the bitrate of audio data provided by the embodiment of the present invention, a model is trained on collected audio data to obtain a trained automatic identification model for the audio data bitrate. According to this model, audio data to be predicted is labeled, yielding labeled data in the target-class bitrate format and labeled data in a non-target-class bitrate format. The occurrence probability of the labeled data in the target-class bitrate format is compared with a preset threshold probability, and if that probability is greater than or equal to the preset threshold probability, the labeled data in the target-class bitrate format is output, so that the bitrates of different audio data are identified automatically.
Fig. 2 is a structural block diagram of the device for automatically identifying the bitrate of audio data provided by an embodiment of the present invention. As shown in Fig. 2, the device comprises: a training model acquisition module 201, a labeled data acquisition module 202, and a comparison module 203.
Specifically, the training model acquisition module trains a model on collected audio data to obtain a trained automatic identification model for the audio data bitrate.
Further, the training model acquisition module is specifically configured to: label the audio data to generate training samples of labeled data in the target-class bitrate format;
convert the audio data of the labeled data in the target-class bitrate format into a corresponding spectrogram;
scale the spectrogram to obtain a corresponding thumbnail;
and train a model on the image data of the thumbnail using a convolutional neural network algorithm, to obtain the trained model for automatic identification of the audio data bitrate.
Further, the training model acquisition module scales the spectrogram by bilinear interpolation to obtain the corresponding thumbnail.
Further, with bilinear interpolation used for scaling, the training model acquisition module uses an AlexNet convolutional neural network as the model and trains it on the image data of the thumbnail, to obtain the trained model for automatic identification of the audio data bitrate.
Here, the AlexNet convolutional neural network used by the training model acquisition module specifically comprises 1 input layer, 5 convolutional layers, 3 pooling layers, 2 fully connected layers, and 1 output layer.
The labeled data acquisition module labels audio data to be predicted according to the trained automatic identification model, to obtain labeled data in the target-class bitrate format and labeled data in a non-target-class bitrate format.
Here, the target-class bitrate of the labeled data obtained by the labeled data acquisition module is a target-class bitrate of the MP3 format, and is any one of the following bitrates: 320 kbps, 256 kbps, 224 kbps, 192 kbps, 128 kbps, 96 kbps, and 64 kbps.
The non-target-class bitrate of the labeled data obtained by the labeled data acquisition module is a non-target-class bitrate of the MP3 format, and comprises all of the remaining bitrates that differ from the aforementioned target-class bitrate of the MP3 format.
The comparison module compares the occurrence probability of the labeled data in the target-class bitrate format with a preset threshold probability and, if that probability is greater than or equal to the preset threshold probability, outputs the labeled data in the target-class bitrate format.
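The way modules 201 to 203 fit together can be sketched as a small class. This is a hypothetical composition for illustration only: the class and method names are not from the patent, and the trained model is represented by an arbitrary callable that maps an audio file path to the probability of the target-class bitrate.

```python
from typing import Callable, List, Tuple


class BitrateIdentificationDevice:
    """Illustrative composition of modules 201-203 (names are hypothetical)."""

    def __init__(self, model: Callable[[str], float], threshold: float = 0.5):
        # `model` stands in for the output of the training model acquisition
        # module (201): path -> probability of the target-class bitrate.
        self.model = model
        self.threshold = threshold

    def label(self, paths: List[str]) -> List[Tuple[str, float]]:
        """Labeled data acquisition module (202): score every file."""
        return [(path, self.model(path)) for path in paths]

    def compare(self, scored: List[Tuple[str, float]]) -> List[str]:
        """Comparison module (203): keep files that reach the threshold."""
        return [path for path, prob in scored if prob >= self.threshold]

    def identify(self, paths: List[str]) -> List[str]:
        """Run labeling followed by the threshold comparison."""
        return self.compare(self.label(paths))
```

Passing the model in as a callable keeps the labeling and comparison logic independent of how, or on which hardware, the model was trained and deployed.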
In addition, the device for automatically identifying the bitrate of audio data further comprises a training model deployment module (not shown in Fig. 2).
The training model deployment module deploys the trained automatic identification model to a digital music storage server cluster in order to label the audio data to be predicted.
Further, the training model deployment module deploys the trained automatic identification model to the digital music storage server cluster in CPU mode.
In the technical solution of the present invention, a model is trained on collected audio data to obtain a trained automatic identification model for the audio data bitrate. According to this model, audio data to be predicted is labeled, yielding labeled data in the target-class bitrate format and labeled data in a non-target-class bitrate format. The occurrence probability of the labeled data in the target-class bitrate format is compared with a preset threshold probability, and if that probability is greater than or equal to the preset threshold probability, the labeled data in the target-class bitrate format is output, so that the bitrates of different audio data are identified automatically.
The specific embodiments described above further explain the purpose, technical solutions, and beneficial effects of the present invention in detail. It should be understood that the foregoing is merely a description of specific embodiments of the present invention and is not intended to limit its scope of protection. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Claims (10)
1. A method for automatically identifying the bit rate of audio data, characterized by comprising:
performing model training on collected audio data to obtain a trained automatic identification model for the bit rate of the audio data;
labeling audio data to be predicted according to the trained automatic identification model, to obtain labeled data in a target-class bit rate format and labeled data in a non-target-class bit rate format;
comparing the probability of the labeled data in the target-class bit rate format with a preset threshold probability, and, if the probability of the labeled data in the target-class bit rate format is greater than or equal to the preset threshold probability, outputting the labeled data in the target-class bit rate format.
2. The method according to claim 1, characterized in that performing model training on the collected audio data to obtain the trained automatic identification model for the bit rate of the audio data specifically comprises:
labeling the audio data to generate training samples of labeled data in the target-class bit rate format;
converting the audio data of the labeled data in the target-class bit rate format into a spectrogram;
scaling the spectrogram image to obtain a corresponding thumbnail;
performing model training on the image data of the thumbnail using a convolutional neural network algorithm, to obtain the corresponding trained model for automatically identifying the bit rate of audio data.
3. The method according to claim 1, characterized in that the target-class bit rate is a target-class bit rate of the MP3 format, and the target-class bit rate of the MP3 format is specifically any one of the following bit rates: 320 kbps, 256 kbps, 224 kbps, 192 kbps, 128 kbps, 96 kbps, and 64 kbps.
4. The method according to claim 3, characterized in that the non-target-class bit rate is a non-target-class bit rate of the MP3 format, and the non-target-class bit rate of the MP3 format specifically comprises all remaining bit rates other than the target-class bit rates of the MP3 format.
5. The method according to claim 2, characterized in that the spectrogram image is scaled by bilinear interpolation to obtain the corresponding thumbnail.
6. The method according to claim 2, characterized in that, by bilinear interpolation, an AlexNet convolutional neural network model is used as the training model to perform model training on the image data of the thumbnail, to obtain the corresponding trained model for automatically identifying the bit rate of audio data.
7. The method according to claim 6, characterized in that the AlexNet convolutional neural network model specifically comprises 1 input layer, 5 convolutional layers, 3 pooling layers, 2 fully connected layers, and 1 output layer.
8. The method according to claim 1, characterized in that the method further comprises: deploying the trained automatic identification model to a digital music storage server cluster, to label audio data to be predicted.
9. The method according to claim 8, characterized in that the trained automatic identification model is deployed to the digital music storage server cluster in CPU mode.
10. A device for automatically identifying the bit rate of audio data, characterized by comprising:
a trained-model acquisition module, which performs model training on collected audio data to obtain a trained automatic identification model for the bit rate of the audio data;
a labeled-data acquisition module, which labels audio data to be predicted according to the trained automatic identification model, to obtain labeled data in a target-class bit rate format and labeled data in a non-target-class bit rate format;
a comparison module, which compares the probability of the labeled data in the target-class bit rate format with a preset threshold probability and, if the probability of the labeled data in the target-class bit rate format is greater than or equal to the preset threshold probability, outputs the labeled data in the target-class bit rate format.
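The image scaling recited in claims 2 and 5 can be illustrated with a minimal pure-Python sketch of bilinear interpolation. It is offered only as an illustration under stated assumptions: the function name `bilinear_resize`, the representation of the spectrogram as a list of rows of floats, and the corner-aligned coordinate mapping are choices made here, and the claims do not specify the thumbnail size.

```python
from typing import List

Image = List[List[float]]  # rows of pixel intensities (hypothetical format)


def bilinear_resize(src: Image, out_h: int, out_w: int) -> Image:
    """Scale a 2-D image to out_h x out_w using bilinear interpolation.

    Output pixel centres are mapped onto the source grid so that the four
    corners of the output coincide with the four corners of the source.
    """
    in_h, in_w = len(src), len(src[0])
    out: Image = []
    for i in range(out_h):
        # Fractional source row for this output row.
        y = i * (in_h - 1) / (out_h - 1) if out_h > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_h - 1)
        dy = y - y0
        row: List[float] = []
        for j in range(out_w):
            # Fractional source column for this output column.
            x = j * (in_w - 1) / (out_w - 1) if out_w > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_w - 1)
            dx = x - x0
            # Interpolate along x on the two neighbouring rows, then along y.
            top = src[y0][x0] * (1 - dx) + src[y0][x1] * dx
            bottom = src[y1][x0] * (1 - dx) + src[y1][x1] * dx
            row.append(top * (1 - dy) + bottom * dy)
        out.append(row)
    return out
```

For example, scaling the 2 x 2 image `[[0, 10], [20, 30]]` to 3 x 3 yields `[[0.0, 5.0, 10.0], [10.0, 15.0, 20.0], [20.0, 25.0, 30.0]]`. Image libraries typically use a slightly different pixel-centre convention, so their output may differ at the borders.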
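The layer structure recited in claim 7 (5 convolutional layers and 3 pooling layers, followed by 2 fully connected layers and 1 output layer) can be made concrete by tracing feature-map sizes through the convolutional part of the network. The kernel sizes, strides, paddings, channel counts, and the 227 x 227 three-channel input used below are those of the commonly used AlexNet configuration and are assumptions here; the claims do not state them. The names `LAYERS`, `out_size`, and `trace_shapes` are hypothetical.

```python
from typing import List, Optional, Tuple

# (name, kernel, stride, padding, output channels).
# None for output channels means the channel count is unchanged (pooling).
# Assumed values from the commonly used AlexNet configuration.
LAYERS: List[Tuple[str, int, int, int, Optional[int]]] = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 2, 0, None),
]


def out_size(size: int, kernel: int, stride: int, pad: int) -> int:
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * pad - kernel) // stride + 1


def trace_shapes(input_size: int = 227,
                 channels: int = 3) -> List[Tuple[str, int, int]]:
    """Return (layer name, channels, spatial size) for the input and each layer."""
    shapes = [("input", channels, input_size)]
    size = input_size
    for name, kernel, stride, pad, out_ch in LAYERS:
        size = out_size(size, kernel, stride, pad)
        if out_ch is not None:
            channels = out_ch
        shapes.append((name, channels, size))
    return shapes


if __name__ == "__main__":
    for name, ch, size in trace_shapes():
        print(f"{name}: {ch} x {size} x {size}")
```

With these assumed parameters the last pooling layer produces a 256 x 6 x 6 feature map, that is, 9216 values, which would be flattened and passed to the two fully connected layers and then to the output layer.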
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610957146.4A CN108010533A (en) | 2016-10-27 | 2016-10-27 | The automatic identifying method and device of voice data code check |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108010533A (en) | 2018-05-08 |
Family
ID=62048392
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610957146.4A Pending CN108010533A (en) | 2016-10-27 | 2016-10-27 | The automatic identifying method and device of voice data code check |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108010533A (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102413378A (en) * | 2011-11-02 | 2012-04-11 | 杭州电子科技大学 | Adaptive neural network-based lost packet recovery method in video transmission |
CN102394065A (en) * | 2011-11-04 | 2012-03-28 | 中山大学 | Analysis method of digital audio fake quality WAVE file |
CN102903379A (en) * | 2012-09-14 | 2013-01-30 | 浪潮(北京)电子信息产业有限公司 | Method and device for detecting MP3 file authenticity |
CN103871405A (en) * | 2014-01-14 | 2014-06-18 | 中山大学 | AMR audio authenticating method |
CN104123935A (en) * | 2014-07-16 | 2014-10-29 | 武汉大学 | Double compression detection method towards MP3 (moving picture experts group audio Layer-3) digital audio file |
Non-Patent Citations (2)
Title |
---|
Daniel Seichter et al.: "AAC encoding detection and bitrate estimation using a convolutional neural network", 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) * |
Gao Chonghong (高冲红) et al.: "Research on CNN-based recording device identification" (基于CNN的录音设备判别研究), Informatization Research (信息化研究) * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109036465A (en) * | 2018-06-28 | 2018-12-18 | 南京邮电大学 | Speech-emotion recognition method |
CN109036465B (en) * | 2018-06-28 | 2021-05-11 | 南京邮电大学 | Speech emotion recognition method |
CN110807159A (en) * | 2019-10-30 | 2020-02-18 | 同盾控股有限公司 | Data marking method and device, storage medium and electronic equipment |
CN110807159B (en) * | 2019-10-30 | 2021-05-11 | 同盾控股有限公司 | Data marking method and device, storage medium and electronic equipment |
CN110992963A (en) * | 2019-12-10 | 2020-04-10 | 腾讯科技(深圳)有限公司 | Network communication method, device, computer equipment and storage medium |
CN110992963B (en) * | 2019-12-10 | 2023-09-29 | 腾讯科技(深圳)有限公司 | Network communication method, device, computer equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20180508 |