CN105513610A - Voice analysis method and device - Google Patents
- Publication number
- CN105513610A CN105513610A CN201510819750.6A CN201510819750A CN105513610A CN 105513610 A CN105513610 A CN 105513610A CN 201510819750 A CN201510819750 A CN 201510819750A CN 105513610 A CN105513610 A CN 105513610A
- Authority
- CN
- China
- Prior art keywords
- neural network
- signal
- different compression
- training
- phonetic feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Electrophonic Musical Instruments (AREA)
Abstract
An embodiment of the invention discloses a voice analysis method and device in the technical field of voice recognition, which improve the accuracy of identifying the source device of an audio file at relatively low cost. The method comprises the following steps: the collected sound signal is processed with different compression algorithms at the same sampling rate and bit rate to obtain one audio file per compression algorithm; unvoiced segments are extracted from each of these audio files, and voice feature signals are derived from the unvoiced segments; a BP neural network is trained with the voice feature signals as training data, a test signal is then analyzed by the trained BP neural network, and the recording device that generated the test signal is identified. The invention is applicable to identifying the source devices of audio files.
Description
Technical field
The present invention relates to the field of voice recognition technology, and in particular to a voice analysis method and device.
Background art
With the spread of electronic devices of every kind, recording equipment has come into wide use. In judicial and law-enforcement practice in particular, the collection of audio files has become an important means of investigation and evidence gathering. However, because audio files are easily forged and reconstruct the circumstances of a case only poorly, they can often serve merely as reference material.
Which device an audio file was recorded on reflects, to some extent, the occasion and scene of the recording, and is important for judging whether the file can serve as valid evidence. At present, however, the recording device of an audio file is distinguished mainly from the experience of the investigating personnel, so accuracy is hard to guarantee; professional voiceprint-analysis equipment is very expensive, and the cost of voice identification and analysis remains high. In short, identifying the source device of an audio file is currently difficult and of low accuracy, while professional voiceprint examination is too costly to be adopted widely in grassroots law-enforcement and judicial work.
Summary of the invention
Embodiments of the invention provide a voice analysis method and device that improve the accuracy of source-device identification of audio files at relatively low cost.
To achieve the above object, embodiments of the invention adopt the following technical solutions.
In a first aspect, an embodiment of the invention provides a voice analysis method, comprising:
processing the collected sound signal with different compression algorithms at the same sampling rate and bit rate to obtain audio files corresponding to the different compression algorithms;
extracting unvoiced segments from the audio files corresponding to the different compression algorithms, and obtaining voice feature signals from the extracted unvoiced segments;
training a BP (Back Propagation, a multilayer feedforward network) neural network with the voice feature signals as training data, analyzing a test signal with the trained BP neural network, and identifying the recording device that generated the test signal.
In a second aspect, an embodiment of the invention provides a voice analysis device, comprising a main system control module, a voice recording/playback assembly, a TFT touch-screen module, a compression-algorithm module, a storage module and a host-computer module, interconnected by a bus;
the voice recording/playback assembly is configured to record and play back sound signals;
the compression-algorithm module is configured to obtain, from the collected sound signal, audio files corresponding to different compression algorithms at the same sampling rate and bit rate;
the storage module is configured to store the audio files corresponding to the different compression algorithms;
the host-computer module is configured to extract unvoiced segments from the audio files corresponding to the different compression algorithms, obtain voice feature signals from the extracted unvoiced segments, train a BP neural network with the voice feature signals as training data, analyze a test signal with the trained BP neural network, and identify the recording device that generated the test signal.
With the voice analysis method and device provided by the embodiments of the invention, the collected sound signal is encoded with different compression algorithms at the same sampling rate and bit rate; unvoiced segments are extracted from the recordings and improved MFCC parameters are computed for each of them; the audio files in the different formats are input into Matlab to obtain the corresponding MFCC feature parameters; the MFCC feature parameters are then used to train a BP neural network; the trained network classifies the voice feature signals, and the recording device is identified from the classification result. Because the components used, such as the STM32 and Matlab, are inexpensive, the accuracy of source-device identification of audio files is improved at relatively low cost.
Brief description of the drawings
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings required by the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the voice analysis method provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of a specific device for executing the voice analysis method provided by an embodiment of the present invention;
Fig. 3 is a schematic flow chart of the unvoiced-segment extraction scheme provided by an embodiment of the present invention;
Fig. 4 is a schematic flow chart of the improved-MFCC parameter extraction scheme provided by an embodiment of the present invention;
Fig. 5 is a schematic flow chart of the voice-feature-signal classification algorithm based on a BP neural network provided by an embodiment of the present invention;
Fig. 6 is a schematic flow chart of the recording-device identification scheme provided by an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of the voice analysis device provided by an embodiment of the present invention.
Detailed description
To help those skilled in the art better understand the technical solutions of the present invention, the invention is described in further detail below with reference to the drawings and specific embodiments. Examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote, throughout, the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the present invention, and are not to be construed as limiting it.
Those skilled in the art will appreciate that, unless expressly stated otherwise, the singular forms "a", "an", "the" and "said" used herein may also include the plural. It should be further understood that the word "comprising" used in this specification denotes the presence of the stated features, integers, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. When an element is said to be "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may be present. Moreover, "connected" or "coupled" as used herein may include wireless connection or coupling. The expression "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
Those skilled in the art will appreciate that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meaning as commonly understood by one of ordinary skill in the art to which the present invention belongs. It should also be understood that terms such as those defined in general dictionaries should be construed as having meanings consistent with their meanings in the context of the prior art and, unless defined herein, are not to be interpreted in an idealized or overly formal sense.
An embodiment of the present invention provides a voice analysis method which, as shown in Fig. 1, comprises the following steps.
101: processing the collected sound signal with different compression algorithms at the same sampling rate and bit rate to obtain audio files corresponding to the different compression algorithms.
In this embodiment, the voice analysis method can be executed on a device with the architecture shown in Fig. 2. Specifically, the STM32 enhanced-series F103VET6 is chosen as the main control chip of the system. The storage module comprises a CH376 USB-disk storage circuit and an SD-card storage module; the compression-algorithm module comprises four audio compression algorithm modules: MP3, AMR, AAC and WMA. The main control chip coordinates the voice recording/playback assembly, the storage module, the TFT (Thin Film Transistor) touch-screen module, the compression-algorithm module and other interfaces such as the serial port. The voice recording/playback assembly comprises an ISD4004 module, an LM386 power amplifier and a filtering/bias module.
After the device is powered on, a segment of speech can be recorded, with the recording ended by a stop key. The recording is passed through the four different compression algorithms, and the four resulting speech files, all with the same sampling rate and bit rate, are stored on a USB disk or SD card. The SD card is a microSD card connected to the STM32 main control module via SDIO (Secure Digital Input and Output Card interface), with cards of up to 8 GB supported; the USB-disk storage module is built around the CH376T and connects the USB disk through a USB type-A interface, with disks of up to 8 GB supported. Power is supplied by a 5 V adapter, and the 3.3 V rail is provided by an AMS1117 chip.
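The re-encoding step above — one source recording compressed by four codecs at an identical sampling rate and bit rate — can be sketched as follows. This is a minimal illustration, not the device firmware: the file name, the 8 kHz / 12.2 kbit/s settings and the use of the ffmpeg command line are all assumptions, not taken from the patent.

```python
# Sketch: build one ffmpeg command per codec so that only the
# compression algorithm differs between the four output files.
# (Hypothetical file name and settings; assumes the ffmpeg CLI.)

CODECS = {                # extension -> ffmpeg encoder name
    "mp3": "libmp3lame",
    "amr": "libopencore_amrnb",
    "aac": "aac",
    "wma": "wmav2",
}

def encode_commands(wav_path, sample_rate=8000, bitrate="12.2k"):
    """One command per codec, all sharing the sampling rate and bit rate."""
    cmds = []
    for ext, codec in CODECS.items():
        out = wav_path.rsplit(".", 1)[0] + "." + ext
        cmds.append(["ffmpeg", "-y", "-i", wav_path,
                     "-ar", str(sample_rate),   # identical sampling rate
                     "-b:a", bitrate,           # identical bit rate
                     "-acodec", codec, out])
    return cmds

cmds = encode_commands("recording.wav")
for c in cmds:
    print(" ".join(c))
```

Executing the commands (rather than just printing them) would yield the four files that the device stores on the USB disk or SD card.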
102: extracting unvoiced segments from the audio files corresponding to the different compression algorithms, and obtaining voice feature signals from the extracted unvoiced segments.
This step can be carried out on the host computer. The unvoiced segments are extracted first; the extraction flow is shown in Fig. 3.
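Fig. 3 itself is not reproduced in this text, but a common way to extract low-energy (unvoiced) segments is short-time energy thresholding, sketched below; the frame length, hop and threshold ratio are assumptions, not values from the patent.

```python
import numpy as np

def extract_unvoiced(signal, frame_len=256, hop=128, energy_ratio=0.1):
    """Return frames whose short-time energy falls below a fraction of
    the mean energy -- a minimal stand-in for the extraction of Fig. 3
    (the threshold choice is an assumption)."""
    n = 1 + max(0, (len(signal) - frame_len) // hop)
    frames = np.stack([signal[i * hop:i * hop + frame_len] for i in range(n)])
    energy = (frames ** 2).sum(axis=1)
    threshold = energy_ratio * energy.mean()
    return frames[energy < threshold]

# Synthetic demo: 0.5 s of near-silence followed by 0.5 s of a 440 Hz tone.
rng = np.random.default_rng(0)
sig = np.concatenate([0.01 * rng.standard_normal(4000),
                      np.sin(2 * np.pi * 440 * np.arange(4000) / 8000)])
quiet = extract_unvoiced(sig)
print(quiet.shape)
```

Only the low-energy first half survives the threshold, so the returned frames come from the near-silent portion of the signal.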
In this embodiment, the compression algorithms comprise four different algorithms: MP3, AMR, WMA and AAC. Obtaining voice feature signals from the extracted unvoiced segments specifically comprises: computing improved MFCC (Mel Frequency Cepstrum Coefficient) parameters according to the flow shown in Fig. 4 and, for each unvoiced segment, extracting 500 groups of 24-dimensional voice feature signals by the cepstrum-coefficient method.
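The "improved" MFCC computation is given only by reference to Fig. 4, so the sketch below shows the standard MFCC chain (windowed FFT → mel filterbank → log → DCT-II) producing the 24 coefficients per frame mentioned in the text; the filterbank size, FFT length and 8 kHz rate are assumptions.

```python
import numpy as np

def mel_filterbank(n_filters=24, n_fft=512, sr=8000):
    """Triangular mel filterbank (parameter choices are assumptions;
    the patent's 'improved MFCC' details are not given)."""
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10 ** (m / 2595.0) - 1.0)
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising edge
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling edge
    return fb

def mfcc_frame(frame, fb, n_coeff=24):
    """24 cepstral coefficients for one frame: |FFT| -> mel energies
    -> log -> DCT-II (computed directly, without scipy)."""
    spec = np.abs(np.fft.rfft(frame * np.hamming(len(frame)), n=512))
    logmel = np.log(fb @ spec + 1e-10)
    n = len(logmel)
    k = np.arange(n_coeff)[:, None]
    dct = np.cos(np.pi * k * (2 * np.arange(n)[None, :] + 1) / (2 * n))
    return dct @ logmel

fb = mel_filterbank()
coeffs = mfcc_frame(np.sin(2 * np.pi * 440 * np.arange(512) / 8000), fb)
print(coeffs.shape)  # (24,)
```

Applying `mfcc_frame` to successive unvoiced frames would yield the groups of 24-dimensional feature vectors used as network input.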
103: training a BP neural network with the voice feature signals as training data, analyzing a test signal with the trained BP neural network, and identifying the recording device that generated the test signal.
In this embodiment, the structure of the BP neural network is as follows: the input layer has 24 nodes, the hidden layer has 25 nodes, and the output layer has 4 nodes.
For example, as shown in Fig. 5, the BP neural network is built with the development tool MATLAB 2014a: the unvoiced segments are extracted programmatically and the feature parameters are computed within them, avoiding interference from the voiced speech signal; this yields the BP-neural-network recognition model of the recording-device identification system.
Specifically, the structure of the BP neural network is determined by the characteristics of the system's input and output data: since the voice feature input signal has 24 dimensions and the speech signals to be classified fall into four classes, the structure of the BP neural network is 24-25-4, i.e. 24 input nodes, 25 hidden nodes and 4 output nodes.
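A forward pass through the 24-25-4 topology described above can be written out directly; the sigmoid/softmax activations and random weights below are illustrative assumptions (the patent does not specify the activation functions).

```python
import numpy as np

rng = np.random.default_rng(1)

# 24-25-4 topology from the text: 24 inputs (feature dimension),
# 25 hidden units, 4 outputs (one per compression format).
W1, b1 = rng.standard_normal((25, 24)) * 0.1, np.zeros(25)
W2, b2 = rng.standard_normal((4, 25)) * 0.1, np.zeros(4)

def forward(x):
    """One forward pass: sigmoid hidden layer, softmax output layer."""
    h = 1.0 / (1.0 + np.exp(-(W1 @ x + b1)))
    z = W2 @ h + b2
    e = np.exp(z - z.max())        # stabilized softmax
    return e / e.sum()

probs = forward(rng.standard_normal(24))
print(probs.shape, probs.sum())
```

The four outputs can be read as class scores for the four audio formats; training would adjust `W1`, `b1`, `W2`, `b2` by backpropagation.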
In the training stage, the BP neural network is trained with the training data. For example, out of 2000 groups of voice feature signals, 1500 randomly selected groups serve as training data for the network, and the remaining 500 groups serve as test data for evaluating its classification ability.
In the test stage after training, the trained BP neural network classifies the test data into speech classes. The overall procedure, shown in Fig. 6, is therefore: from the collected sound signal, four speech files in different audio formats are obtained; after processing on the host computer, inputting a segment of speech allows its audio format to be identified and hence the recording device on which it was recorded to be determined.
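The 1500/500 train/test protocol above can be sketched with scikit-learn's `MLPClassifier` (a backpropagation-trained feedforward network) standing in for the MATLAB toolbox; the synthetic 24-dimensional four-class data below is an assumption that replaces the real cepstral features.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in for the 2000 feature vectors (real data would be
# 24-dim cepstral features from the four compression formats).
centers = rng.standard_normal((4, 24)) * 3
X = np.vstack([c + rng.standard_normal((500, 24)) for c in centers])
y = np.repeat(np.arange(4), 500)

idx = rng.permutation(2000)          # random 1500/500 split as in the text
train, test = idx[:1500], idx[1500:]

clf = MLPClassifier(hidden_layer_sizes=(25,),  # 24-25-4 once fitted
                    max_iter=500, random_state=0)
clf.fit(X[train], y[train])
acc = clf.score(X[test], y[test])
print(round(acc, 3))
```

On such well-separated synthetic classes the held-out accuracy is close to 1.0; real cepstral features from different codecs would overlap more.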
With the voice analysis method provided by the embodiment of the present invention, the collected sound signal is encoded with different compression algorithms at the same sampling rate and bit rate; unvoiced segments are extracted from the recordings and improved MFCC parameters are computed for each of them; the audio files in the different formats are input into Matlab to obtain the corresponding MFCC feature parameters; the MFCC feature parameters are then used to train a BP neural network; the trained network classifies the voice feature signals, and the recording device is identified from the classification result. Because the components used, such as the STM32 and Matlab, are inexpensive, the accuracy of source-device identification of audio files is improved at relatively low cost.
Further, an embodiment of the present invention provides a voice analysis device which, as shown in Fig. 7, comprises a main system control module, a voice recording/playback assembly, a TFT touch-screen module, a compression-algorithm module, a storage module and a host-computer module, interconnected by a bus.
The voice recording/playback assembly is configured to record and play back sound signals.
The compression-algorithm module is configured to obtain, from the collected sound signal, audio files corresponding to different compression algorithms at the same sampling rate and bit rate.
The storage module is configured to store the audio files corresponding to the different compression algorithms.
The host-computer module is configured to extract unvoiced segments from the audio files corresponding to the different compression algorithms and obtain voice feature signals from the extracted unvoiced segments, and further to train a BP neural network with the voice feature signals as training data, analyze a test signal with the trained BP neural network, and identify the recording device that generated the test signal.
Specifically, the main system control module is an STM32 enhanced F103VET6 chip, a 32-bit enhanced MCU with an ARM Cortex-M3 core, 512 KB of Flash, 64 KB of RAM, three SPI ports, an SDIO port, five USARTs and a maximum clock frequency of 72 MHz. The voice recording/playback assembly is an ISD4004, with audio amplification performed by an LM386 integrated audio power amplifier; the recording length is set to 8-16 minutes. Because the sound signal must be captured and reassembled, repeated sampling and quantization would introduce quantization error; the ISD4004 instead records by multilevel direct analog storage, writing each sample value directly into on-chip flash memory, so that speech is reproduced very faithfully and naturally. The LM386 integrated audio power amplifier is chosen for audio amplification, and the AMS1117-3.3 for voltage regulation. The host-computer module uses MATLAB 2014a to extract unvoiced segments from the audio files corresponding to the different compression algorithms, obtains voice feature signals from the extracted unvoiced segments, trains a BP neural network with the voice feature signals as training data, analyzes a test signal with the trained BP neural network, and identifies the recording device that generated the test signal.
With the voice analysis device provided by the embodiment of the present invention, the collected sound signal is encoded with different compression algorithms at the same sampling rate and bit rate; unvoiced segments are extracted from the recordings and improved MFCC parameters are computed for each of them; the audio files in the different formats are input into Matlab to obtain the corresponding MFCC feature parameters; the MFCC feature parameters are then used to train a BP neural network; the trained network classifies the voice feature signals, and the recording device is identified from the classification result. Because the components used, such as the STM32 and Matlab, are inexpensive, the accuracy of source-device identification of audio files is improved at relatively low cost.
The embodiments in this specification are described in a progressive manner; for identical or similar parts the embodiments may refer to one another, and each embodiment focuses on its differences from the others. In particular, the device embodiment is described relatively briefly because it is substantially similar to the method embodiment; for the relevant details, refer to the description of the method embodiment.
The above are only specific embodiments of the present invention, but the protection scope of the present invention is not limited thereto. Any change or substitution readily conceivable by those familiar with the art within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be defined by the protection scope of the claims.
Claims (5)
1. A voice analysis method, characterized by comprising:
processing the collected sound signal with different compression algorithms at the same sampling rate and bit rate to obtain audio files corresponding to the different compression algorithms;
extracting unvoiced segments from the audio files corresponding to the different compression algorithms, and obtaining voice feature signals from the extracted unvoiced segments;
training a BP neural network with the voice feature signals as training data, analyzing a test signal with the trained BP neural network, and identifying the recording device that generated the test signal.
2. The method according to claim 1, characterized in that the compression algorithms comprise four different algorithms: MP3, AMR, WMA and AAC;
and obtaining voice feature signals from the extracted unvoiced segments comprises: for each unvoiced segment, extracting 500 groups of 24-dimensional voice feature signals by the cepstrum-coefficient method.
3. The method according to claim 2, characterized in that the structure of the BP neural network comprises: an input layer with 24 nodes, a hidden layer with 25 nodes, and an output layer with 4 nodes.
4. A voice analysis device, characterized by comprising a main system control module, a voice recording/playback assembly, a TFT touch-screen module, a compression-algorithm module, a storage module and a host-computer module, interconnected by a bus;
the voice recording/playback assembly is configured to record and play back sound signals;
the compression-algorithm module is configured to obtain, from the collected sound signal, audio files corresponding to different compression algorithms at the same sampling rate and bit rate;
the storage module is configured to store the audio files corresponding to the different compression algorithms;
the host-computer module is configured to extract unvoiced segments from the audio files corresponding to the different compression algorithms, obtain voice feature signals from the extracted unvoiced segments, train a BP neural network with the voice feature signals as training data, analyze a test signal with the trained BP neural network, and identify the recording device that generated the test signal.
5. The device according to claim 4, characterized in that the main system control module is an STM32 enhanced F103VET6 chip;
the voice recording/playback assembly is an ISD4004, with audio amplification performed by an LM386 integrated audio power amplifier;
and the host-computer module extracts the unvoiced segments from the audio files corresponding to the different compression algorithms specifically by means of MATLAB 2014a, obtains voice feature signals from the extracted unvoiced segments, trains a BP neural network with the voice feature signals as training data, analyzes a test signal with the trained BP neural network, and identifies the recording device that generated the test signal.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510819750.6A CN105513610A (en) | 2015-11-23 | 2015-11-23 | Voice analysis method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510819750.6A CN105513610A (en) | 2015-11-23 | 2015-11-23 | Voice analysis method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105513610A true CN105513610A (en) | 2016-04-20 |
Family
ID=55721536
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510819750.6A Pending CN105513610A (en) | 2015-11-23 | 2015-11-23 | Voice analysis method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105513610A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103325382A (en) * | 2013-06-07 | 2013-09-25 | 大连民族学院 | Method for automatically identifying Chinese national minority traditional instrument audio data |
WO2013149123A1 (en) * | 2012-03-30 | 2013-10-03 | The Ohio State University | Monaural speech filter |
CN103426438A (en) * | 2012-05-25 | 2013-12-04 | 洪荣昭 | Method and system for analyzing baby crying |
US20140019390A1 (en) * | 2012-07-13 | 2014-01-16 | Umami, Co. | Apparatus and method for audio fingerprinting |
CN104732977A (en) * | 2015-03-09 | 2015-06-24 | 广东外语外贸大学 | On-line spoken language pronunciation quality evaluation method and system |
- 2015-11-23: CN application CN201510819750.6A filed; published as CN105513610A (status: pending)
Non-Patent Citations (1)
Title |
---|
He Qianhua (贺前华) et al., "Recording Device Identification Method Based on Improved PNCC Features and Two-Step Discriminative Training", Acta Electronica Sinica (《电子学报》) *
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107516527A (en) * | 2016-06-17 | 2017-12-26 | 中兴通讯股份有限公司 | A kind of encoding and decoding speech method and terminal |
CN106231357A (en) * | 2016-08-31 | 2016-12-14 | 浙江华治数聚科技股份有限公司 | A kind of Forecasting Methodology of television broadcast media audio, video data chip time |
CN106331741A (en) * | 2016-08-31 | 2017-01-11 | 浙江华治数聚科技股份有限公司 | Television and broadcast media audio and video data compression method |
CN106331741B (en) * | 2016-08-31 | 2019-03-08 | 徐州视达坦诚文化发展有限公司 | A kind of compression method of television broadcast media audio, video data |
CN106997767A (en) * | 2017-03-24 | 2017-08-01 | 百度在线网络技术(北京)有限公司 | Method of speech processing and device based on artificial intelligence |
CN110728991A (en) * | 2019-09-06 | 2020-01-24 | 南京工程学院 | Improved recording equipment identification algorithm |
CN110728991B (en) * | 2019-09-06 | 2022-03-01 | 南京工程学院 | Improved recording equipment identification algorithm |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | | Application publication date: 20160420 |