KR101382356B1 - Apparatus for forgery detection of audio file - Google Patents
- Publication number
- KR101382356B1 (Application No. KR1020130079293A)
- Authority
- KR
- South Korea
- Prior art keywords
- audio signal
- learning
- feature data
- model
- recording path
- Prior art date
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/10—Digital recording or reproducing
Description
An embodiment of the present invention relates to an apparatus for detecting whether or not an audio file is forged.
The contents described in this section merely provide background information on the embodiment of the present invention and do not constitute the prior art.
The spread of digital devices such as digital recorders, MP3 players, and smartphones has made it easy for anyone to create digital voice recordings. With the ease of covert recording using such devices, the number of cases in which voice files are submitted as evidence in court continues to increase.
Digital voice files are easy to forge through insertion, deletion, and copying, but such forgery is not easy to detect, so experts judge whether a file has been forged through precise listening analysis and device analysis. Such expert analysis of an entire voice file takes a long time. A system that automatically detects sections suspected of forgery can effectively reduce the time and cost required for file analysis.
The digital voice file contains sound signal information according to the characteristics of the recording device and ambient noise during recording, and has different characteristics depending on the compression technique or the storage technique. Automatic identification of these features can improve the accuracy of the analysis results while reducing the physical time required to analyze the evidence.
A main object of an embodiment of the present invention is to provide a forgery-suspected interval detection apparatus that extracts feature data of an audio file, estimates the recording path used in generating the audio file, and automatically detects a suspected forgery interval of the audio file based on the recording path estimation result, thereby shortening the physical time required for evidence data analysis and improving the accuracy of the analysis results.
One embodiment of the present invention provides a method comprising: receiving an audio signal; generating a plurality of frame unit signals by dividing the audio signal into frame units having a predetermined length; extracting feature data from each of the plurality of frame unit signals; generating an estimated recording path by estimating all or each recording path of the plurality of frame unit signals; and detecting a singular section based on the estimated recording path.
The generating of the frame unit signal may allow a plurality of the frame unit signals to overlap at regular intervals.
In the generating of the estimated recording path, the recording path of the model having the highest similarity with the feature data may be estimated as all or respective recording paths of the corresponding frame unit signal.
In the detecting of the singular section, a recording path having the highest frequency among the estimated recording paths may be set as a default path, and a section including frame unit signals whose estimated recording path differs from the default path may be detected as the singular section.
The feature data may be generated by at least one of features such as Mel-Frequency Cepstral Coefficient (MFCC), Linear Prediction Coding (LPC) Coefficient, and Perceptual Linear Prediction (PLP) Coefficient.
The method for detecting a forgery suspect interval further includes adding a model, wherein adding the model includes: receiving path information of the model; Receiving a learning audio signal; Extracting learning feature data of the learning audio signal; And learning the feature data for learning.
The learning may use at least one of methods such as Gaussian Mixture Model (GMM), Hidden Markov Model (HMM), and Support Vector Machine (SVM).
The extracting of the feature data for learning may be performed using the same method as the method of extracting the feature data used in the process of extracting the feature data.
Further, another aspect of the embodiment of the present invention provides a model learning method comprising: receiving path information of a model; receiving an audio signal for training the model; extracting feature data of the audio signal; and learning the feature data.
The path information may include at least one of a device used for recording, a sampling frequency, a compression method, and a transmission method.
The feature data may be generated by at least one of features such as Mel-Frequency Cepstral Coefficient (MFCC), Linear Prediction Coding (LPC) Coefficient, and Perceptual Linear Prediction (PLP) Coefficient.
The learning may use at least one of methods such as Gaussian Mixture Model (GMM), Hidden Markov Model (HMM), and Support Vector Machine (SVM).
In addition, according to another aspect of an embodiment of the present invention, there is provided an apparatus comprising: an audio signal receiving unit for receiving an audio signal; a frame dividing unit for dividing the audio signal into frame units having a predetermined length to generate a plurality of frame unit signals; a feature data extraction unit for extracting feature data from each of the plurality of frame unit signals; a model storage unit for storing a model; a recording path estimator for estimating all or each recording path of the plurality of frame unit signals using the feature data and the model to generate an estimated recording path; and a singular section comparison detection unit for detecting a singular section based on the estimated recording path.
The apparatus for detecting a forgery suspect period further includes a model adding unit, wherein the model adding unit includes: a learning feature data extracting unit for receiving a learning audio signal and extracting learning feature data from the learning audio signal; And a model learning unit configured to receive the training feature data and the path information of the model, train the model, generate the learning result as an additional model, and transmit the additional model to the model storage unit.
The learning feature data extracting unit may use the same method as the method of extracting the feature data used in the feature data extracting unit as a method of extracting the learning feature data.
In addition, according to another aspect of an embodiment of the present invention, an audio signal receiving unit for receiving an audio signal; A feature data extraction unit for extracting feature data from the audio signal; And a model learner which receives the feature data and the path information of the audio signal, learns the feature data, and generates the learning result as a model.
According to an embodiment of the present invention, by automatically detecting a section suspected of forgery in an audio file, it is possible to shorten the physical time required to analyze the audio file, reduce the cost of expert personnel, and also improve the accuracy of the analysis.
FIG. 1 is a view schematically showing the configuration of a forgery suspected interval detection device according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating an operation of a singular section detection unit in the forgery suspected section detection apparatus according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating in detail a process of generating a frame unit signal in the apparatus for detecting a suspected forgery interval according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating in detail a process of extracting feature data in the apparatus for detecting a suspected forgery interval according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating the operation of the model addition unit in the forgery suspected interval detection device according to an embodiment of the present invention.
FIG. 6 is a block diagram of a forgery suspected interval detection device according to an embodiment of the present invention.
FIG. 7 is a view showing in detail the model addition unit in the forgery suspected interval detection device according to an embodiment of the present invention.
Hereinafter, the present invention will be described in detail with reference to the accompanying drawings.
It should be noted that, in adding reference numerals to the components in the drawings, the same components are denoted by the same reference numerals wherever possible, even when they appear in different drawings. In the following description of the present invention, detailed descriptions of known functions and configurations are omitted when they could obscure the subject matter of the present invention.
In describing the components of the present invention, terms such as first, second, A, B, (a), and (b) may be used. These terms are intended only to distinguish one component from another and do not limit the nature, sequence, or order of the components. When a component is described as being "connected," "coupled," or "linked" to another component, the component may be directly connected or coupled to the other component, but it should be understood that a third component may also be "connected," "coupled," or "linked" between them.
FIG. 1 is a view schematically showing the configuration of the forgery suspected interval detection device according to an embodiment of the present invention.
The forgery suspected interval detection device 100 includes a singular section detection unit 110 and a model addition unit 120.
The singular section detection unit 110 detects, from a detection target audio signal, a singular section suspected of forgery. The model addition unit 120 trains and adds models for recording paths used by the singular section detection unit 110.
FIG. 2 is a flowchart illustrating an operation of the singular section detection unit 110 according to an embodiment of the present invention.
The method for detecting the singular section by the singular section detection unit 110 is as follows. First, a detection target audio signal is received (S210).
The audio signal in which the singular section is to be detected, that is, the detection target audio signal, may be a digital or analog signal; digital signals may conform to an uncompressed format such as WAV, AIFF, or AU, or to a compressed format such as MP3, WMA, MPC, or OGG.
Thereafter, the received audio signal is divided into frame units having a predetermined length to generate frame unit signals (S220). FIG. 3 illustrates this process (S220) in detail.
As shown in FIG. 3, the energy of an audio signal is concentrated in the low-frequency band, so a pre-emphasis process is performed first to compensate for the high-frequency components; the received audio signal is thereby processed to have a more uniform energy distribution over the entire band.
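As a concrete illustration, the pre-emphasis step is just a first-order high-pass difference filter. The sketch below assumes the commonly used coefficient 0.97, which the text does not specify:

```python
import numpy as np

def pre_emphasis(signal, coeff=0.97):
    """First-order high-pass filter: y[n] = x[n] - coeff * x[n-1].

    Boosts high-frequency components so the spectrum is flatter
    across the whole band. The coefficient 0.97 is an assumed,
    commonly used value, not one given by the text."""
    return np.append(signal[0], signal[1:] - coeff * signal[:-1])

x = np.array([1.0, 1.0, 1.0, 1.0])   # a constant (purely low-frequency) signal
y = pre_emphasis(x)                   # constant part is strongly attenuated
```

A constant input is almost entirely suppressed after the first sample, which is exactly the low-frequency attenuation the step is meant to provide.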
The pre-emphasis audio signal is divided into frames. A window function such as rectangular, hamming, hanning, and kaiser-bessel may be used to divide the audio signal in units of frames.
At this time, it is preferable to set the frame length to a very short interval (tens of msec) over which the characteristics of the audio signal can be assumed stationary. In the present embodiment, the frame length is set to 3 to 4 times the pitch period, but is not necessarily limited thereto.
When the analysis is divided into frame units as described above, in order to ensure continuity between analysis units, the divided sections are set to overlap to some extent. For example, if the length of the frame is set to 20 msec and the moving period of the frame is set to 10 msec, a duplicate portion of 10 msec occurs between adjacent frames.
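The framing-with-overlap scheme described above (for example, 20 msec frames moved in 10 msec steps) can be sketched as follows; the Hamming window is one of the window functions listed earlier, and the 8 kHz sample rate is only an illustrative assumption:

```python
import numpy as np

def split_into_frames(signal, sample_rate, frame_ms=20, hop_ms=10):
    """Divide a signal into overlapping frames and apply a Hamming window.

    With frame_ms=20 and hop_ms=10, adjacent frames share a 10 msec
    overlap, preserving continuity between analysis units."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = int(sample_rate * hop_ms / 1000)
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.stack([signal[i * hop:i * hop + frame_len] * window
                     for i in range(n_frames)])

sr = 8000                       # assumed sample rate for illustration
audio = np.random.randn(sr)     # one second of synthetic audio
frames = split_into_frames(audio, sr)   # 99 frames of 160 samples each
```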
Feature data is extracted from each frame unit signal (S230). As the features, Mel-Frequency Cepstral Coefficients (MFCC), Linear Prediction Coding (LPC) coefficients, Perceptual Linear Prediction (PLP) coefficients, total power spectrum, subband power, frequency centroid, bandwidth, and pitch frequency may be used.
In describing the operation of the forgery suspected interval detection device according to the present embodiment, MFCC is used as the feature data, but the feature is not necessarily limited thereto.
FIG. 4 is a diagram illustrating the process of extracting MFCC feature data (S230) in detail.
The frame unit signal is converted into the frequency domain by a Fast Fourier Transform (FFT), and the frequency-domain signal passes through a Mel-Scale Filter Bank.
The Mel-Scale Filter Bank is composed of a plurality of band-pass filters modeled on the human auditory structure; the center frequencies of the filters are spaced linearly below 1 kHz and on a log scale above 1 kHz, based on the mel scale, a perceptual frequency unit. Processing by the Mel-Scale Filter Bank makes the incoming audio signal resemble the spectral representation perceived by the human auditory system.
The signal passing through the Mel-Scale Filter Bank is processed through a log step, and the MFCC feature data is then obtained by a discrete cosine transform (DCT).
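The FFT, mel filter bank, log, and DCT pipeline of FIG. 4 can be sketched as below. The FFT size (512), the number of filters (26), and the number of retained coefficients (13) are common defaults assumed for illustration, not values taken from the text:

```python
import numpy as np
from scipy.fft import dct

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(frame, sample_rate, n_filters=26, n_coeffs=13):
    """One frame -> MFCC vector: |FFT|^2 -> mel filter bank -> log -> DCT."""
    n_fft = 512
    spectrum = np.abs(np.fft.rfft(frame, n_fft)) ** 2
    # Triangular filters spaced uniformly on the mel scale:
    # linear below ~1 kHz, logarithmic above, per the mel mapping.
    mel_points = np.linspace(hz_to_mel(0), hz_to_mel(sample_rate / 2),
                             n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points)
                    / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    energies = np.maximum(fbank @ spectrum, 1e-10)   # avoid log(0)
    return dct(np.log(energies), type=2, norm='ortho')[:n_coeffs]

coeffs = mfcc(np.random.randn(160), 8000)   # one 20 msec frame at 8 kHz
```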
Next, an estimated recording path is generated by estimating the recording path of each frame unit signal (S240). At this time, there should be a model trained for each recording path. That is, a model is trained with feature data extracted from audio signals whose path information is known, and the recording path of each frame is estimated by comparing the trained models with the feature data extracted from the detection target audio signal.
If there is no trained model for a recording path, a model may first be added through the model addition unit 120 described later.
The estimated recording path is generated by estimating the recording path of the frame with the model with the highest similarity by comparing the feature data of each frame with the learned model.
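Given trained per-path models, the estimation step reduces to scoring each frame's feature vector against every model and picking the most similar one. A minimal sketch using scikit-learn's `GaussianMixture` (the two recording-path labels and the synthetic training features are purely hypothetical):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Hypothetical feature data for two known recording paths; in practice
# these would be MFCC vectors from audio with known path information.
models = {}
for name, mean in [("phone_amr", 0.0), ("recorder_wav", 5.0)]:
    feats = rng.normal(mean, 1.0, size=(200, 13))
    models[name] = GaussianMixture(n_components=2, random_state=0).fit(feats)

def estimate_path(frame_features, models):
    """Return the recording-path label whose model gives the frame
    the highest log-likelihood (i.e., the highest similarity)."""
    scores = {name: m.score(frame_features.reshape(1, -1))
              for name, m in models.items()}
    return max(scores, key=scores.get)

# A frame whose features resemble the second path's training data
label = estimate_path(rng.normal(5.0, 1.0, 13), models)
```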
The path information on a recording path means information that can affect the formation of the characteristics of an audio signal during its generation. The path information may include the device used for recording, the sampling frequency, the compression method, the transmission method, and the like, and such pieces of information may be combined to form the path information.
Information on the equipment used for recording includes the type of device, such as a mini recorder, small ballpoint pen recorder, calculator recorder, mobile phone, video camera, cassette tape recorder, or digital audio tape (DAT), and may also include the model name and manufacturing serial number of each device.
A sampling frequency of 44.1 kHz is used for home media such as CD (Compact Disc), MD (Mini Disc), and LD (Laser Disc), while 48.0 kHz is used for broadcasting equipment such as DAT. Measurement instruments use a sampling frequency much higher than the Nyquist frequency, such as 96 kHz, to improve the signal-to-noise ratio.
Compression methods such as MPEG-1 Layer III (MP3), Advanced Audio Coding (AAC), High-Efficiency Advanced Audio Coding (HE-AAC), Dolby Digital (AC3), Adaptive Multi-Rate (AMR), Adaptive Transform Acoustic Coding (ATRAC), Adaptive Differential Pulse Code Modulation (ADPCM), and Free Lossless Audio Codec (FLAC) may be used.
Transmission methods such as AES/EBU (Audio Engineering Society / European Broadcasting Union), S/PDIF (Sony/Philips Digital Interface Format), ADI (Alesis Digital Interface), TDIF (Tascam Digital InterFace), and USB (Universal Serial Bus) may be used.
Depending on the device used, the sampling frequency, compression method, and transmission method employed by that device may be determined, and this may be reflected when the path information is constructed. When the forgery suspected interval detection device 100 holds models for the recording paths of interest, the recording path of each frame can be estimated against those models.
Thereafter, the singular section is detected based on the estimated recording path of each frame (S250).
Among the estimated recording paths, the estimated recording path having the highest frequency is selected as the default path, and a section including a frame whose estimated recording path differs from the default path is detected as a singular section.
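The default-path majority vote and the flagging of deviating frames can be sketched as:

```python
from collections import Counter

def detect_singular_sections(estimated_paths):
    """Flag contiguous runs of frames whose estimated recording path
    differs from the most frequent (default) path.

    Returns the default path and a list of (start, end) frame indices."""
    default = Counter(estimated_paths).most_common(1)[0][0]
    sections, start = [], None
    for i, p in enumerate(estimated_paths):
        if p != default and start is None:
            start = i                      # a suspect run begins
        elif p == default and start is not None:
            sections.append((start, i - 1))  # the run just ended
            start = None
    if start is not None:                    # run extends to the last frame
        sections.append((start, len(estimated_paths) - 1))
    return default, sections

# Per-frame estimated paths; "B" and "C" frames deviate from the default "A"
paths = ["A", "A", "A", "B", "B", "A", "A", "C", "A"]
default, sections = detect_singular_sections(paths)
```

Frames 3 to 4 and frame 7 would be reported as singular sections suspected of forgery.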
FIG. 5 is a flowchart illustrating the operation of the model addition unit 120 according to an embodiment of the present invention.
The method of adding a model by the model addition unit 120 is as follows.
First, path information of the model to be added is received (S510).
Thereafter, an audio signal to be used for learning the recording path model is received (S520). This audio signal should be one generated through the recording path described by the path information received in step S510.
Then, feature data of the received audio signal is extracted (S530). At this time, the feature data should be extracted using the same method as that used in the feature data extraction step (S230) of the singular section detection unit 110.
Next, the model of the recording path is trained using the extracted feature data (S540). As the learning method, a Gaussian Mixture Model (GMM), a Hidden Markov Model (HMM), or a Support Vector Machine (SVM) may be used.
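Step S540 with a GMM learner can be sketched with scikit-learn; the number of mixture components and the diagonal covariance are assumed hyperparameters, and HMM or SVM learners would be alternatives as noted above:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_path_model(learning_features, n_components=4):
    """Fit a GMM to the learning feature vectors of one known recording path.

    n_components and the diagonal covariance are illustrative choices,
    not values specified by the text."""
    gmm = GaussianMixture(n_components=n_components,
                          covariance_type="diag", random_state=0)
    return gmm.fit(learning_features)

# Hypothetical path information (S510) and learning features (S520-S530)
path_info = {"device": "mini recorder", "sampling_hz": 44100,
             "compression": "MP3", "transport": "USB"}
features = np.random.default_rng(1).normal(size=(500, 13))
model = train_path_model(features)
# The fitted model, keyed by its path information, forms the added model
model_store = {tuple(path_info.values()): model}
```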
By repeating this procedure, models for various recording paths can be added and used by the apparatus for detecting the suspected forgery interval.
FIG. 6 is a block diagram of the forgery suspected interval detection device according to an embodiment of the present invention.
The forgery suspected interval detection device 100 includes a detection audio signal receiving unit 610, a frame division unit 620, a feature data extraction unit 630, a recording path estimation unit 640, a model storage unit 650, and a singular section comparison detection unit 660.
The detection audio signal receiving unit 610 receives the detection target audio signal.
The frame division unit 620 divides the received audio signal into frame units having a predetermined length to generate a plurality of frame unit signals.
In this case, in order to ensure continuity between analysis units, the divided sections are divided to overlap each other. For example, if the length of the frame is set to 20 msec and the moving period of the frame is set to 10 msec, a duplicate portion of 10 msec occurs between adjacent frames.
The feature data extraction unit 630 extracts feature data from each of the plurality of frame unit signals.
In describing the operation of the forgery suspected interval detection device 100, MFCC is used as the feature data, but the feature is not necessarily limited thereto.
The frame unit signal is converted into the frequency domain by a Fast Fourier Transform (FFT), and the frequency-domain signal passes through a Mel-Scale Filter Bank. The signal passing through the Mel-Scale Filter Bank undergoes a logarithmic transformation, and the MFCC feature data is obtained by a discrete cosine transform.
The recording path estimation unit 640 compares the feature data of each frame unit signal with the models stored in the model storage unit 650, and generates an estimated recording path by selecting, for each frame, the recording path of the model with the highest similarity.
The path information of a recording path means information that may affect the formation of the characteristics of an audio signal during its generation, and may include the device used for recording, the sampling frequency, the encoding method, the compression method, the transmission method, and the like; such pieces of information may be combined to form the path information. Details thereof are omitted since they have been described above.
The singular section comparison detection unit 660 sets the most frequent estimated recording path as the default path and detects, as a singular section, a section containing frame unit signals whose estimated recording path differs from the default path.
FIG. 7 is a view showing in detail the model addition unit 120 according to an embodiment of the present invention.
The model addition unit 120 includes a learning audio signal receiving unit 710, a learning feature data extraction unit 720, and a model learning unit 730.
The learning audio signal receiving unit 710 receives a learning audio signal generated through a known recording path. The learning feature data extraction unit 720 extracts learning feature data from the learning audio signal, using the same extraction method as the feature data extraction unit 630.
The model learning unit 730 receives the learning feature data and the path information of the model, trains the model, generates the learning result as an additional model, and transmits the additional model to the model storage unit 650.
The apparatus for detecting suspected forgery intervals may be implemented by combining the components described with reference to FIGS. 6 and 7.
While the present invention has been described in connection with what are presently considered to be practical and preferred embodiments, it is to be understood that the invention is not limited to the disclosed embodiments. That is, within the scope of the present invention, all of the components may be selectively combined and operated in one or more configurations.
The above description is merely illustrative of the technical idea of the present invention, and those skilled in the art to which the present invention pertains may make various modifications and changes without departing from the essential characteristics of the present invention. Therefore, the embodiments disclosed in the present invention are not intended to limit the technical idea of the present invention but to describe the present invention, and the scope of the technical idea of the present invention is not limited by these embodiments. The scope of protection of the present invention should be interpreted by the following claims, and all technical ideas within the scope equivalent thereto should be construed as being included in the scope of the present invention.
100: forgery suspected section detection device
110: singular section detection unit 120: model addition unit
610: detection audio signal receiving unit 620: frame division unit
630: feature data extraction unit 640: recording path estimation unit
650: model storage unit 660: specific section comparison detection unit
710: learning audio signal receiving unit 720: learning feature data extraction unit
730: model learning unit
Claims (16)
A forgery suspected interval detection method comprising:
generating a plurality of frame unit signals by dividing an audio signal into frame units having a predetermined length;
extracting, as feature data, one of a Mel-Frequency Cepstral Coefficient (MFCC), a Linear Prediction Coding (LPC) coefficient, and a Perceptual Linear Prediction (PLP) coefficient from each of the plurality of frame unit signals;
setting a model of the audio signal recording path having the highest similarity with the feature data, among models of a plurality of audio signal recording paths, as the estimated recording path of each frame unit signal; and
setting the most frequently estimated recording path among the estimated recording paths as a basic recording path, and detecting a section including a frame unit signal having an estimated recording path different from the basic recording path as a singular section suspected of forgery.
The generating of the frame unit signal may include a plurality of frame unit signals overlapping at regular intervals.
The method for detecting a forgery suspect interval further includes adding a model for the audio signal recording path,
Adding a model for the audio signal recording path,
Receiving a learning audio signal;
Receiving at least one of path information including a sampling frequency, a compression method, a device used for recording, and a transmission method for the learning audio signal as a learning audio signal recording path;
Extracting, from the learning audio signal, one of a Mel-Frequency Cepstral Coefficient (MFCC), a Linear Prediction Coding (LPC) coefficient, and a Perceptual Linear Prediction (PLP) coefficient as learning feature data; And
Learning a model for the learning audio signal recording path using the learning feature data by at least one learning method among a Gaussian Mixture Model (GMM), a Hidden Markov Model (HMM), and a Support Vector Machine (SVM).
Wherein the learning feature data is extracted using the same method as that used in the extracting of the feature data.
A model learning method comprising:
extracting, from a learning audio signal, one of a Mel-Frequency Cepstral Coefficient (MFCC), a Linear Prediction Coding (LPC) coefficient, and a Perceptual Linear Prediction (PLP) coefficient as feature data; and
learning the recording path information using the feature data.
Wherein the recording path information comprises at least one of a device used for recording, a sampling frequency, a compression method, and a transmission method.
A forgery suspected interval detection device comprising:
a frame dividing unit dividing an audio signal into frame units having a predetermined length to generate a plurality of frame unit signals;
a feature data extraction unit for extracting, as feature data, one of a Mel-Frequency Cepstral Coefficient (MFCC), a Linear Prediction Coding (LPC) coefficient, and a Perceptual Linear Prediction (PLP) coefficient from each of the plurality of frame unit signals;
a model storage unit for storing at least one model for an audio signal recording path;
a recording path estimation unit configured to set the model of the audio signal recording path having the highest similarity with the feature data, among the stored models, as the estimated recording path of each frame unit signal; and
a singular section comparison detection unit that sets the most frequently estimated recording path among the estimated recording paths as a basic recording path, and detects a section including a frame unit signal having an estimated recording path different from the basic recording path as a singular section suspected of forgery.
The forgery suspected interval detection device further includes a model addition unit,
The model adder,
A learning feature data extraction unit for receiving a learning audio signal and extracting, as learning feature data, one of a Mel-Frequency Cepstral Coefficient (MFCC), a Linear Prediction Coding (LPC) coefficient, and a Perceptual Linear Prediction (PLP) coefficient from the learning audio signal; and
A model learning unit configured to receive path information of the learning audio signal, learn the path information using the learning feature data, generate an additional model for an audio signal recording path based on the learning result, and transmit the additional model to the model storage unit.
And wherein the learning is performed by at least one of a learning method including a Gaussian Mixture Model (GMM), a Hidden Markov Model (HMM), and a Support Vector Machine (SVM).
And the learning feature data extracting unit extracts the learning feature data using the same method as the method of extracting the feature data used by the feature data extracting unit.
A model learning apparatus comprising:
a feature data extraction unit for extracting, from an audio signal, one of a Mel-Frequency Cepstral Coefficient (MFCC), a Linear Prediction Coding (LPC) coefficient, and a Perceptual Linear Prediction (PLP) coefficient as feature data; and
a model learning unit for learning recording path information using the feature data and generating a model for the audio signal recording path based on the learning result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020130079293A KR101382356B1 (en) | 2013-07-05 | 2013-07-05 | Apparatus for forgery detection of audio file |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020130079293A KR101382356B1 (en) | 2013-07-05 | 2013-07-05 | Apparatus for forgery detection of audio file |
Publications (1)
Publication Number | Publication Date |
---|---|
KR101382356B1 (en) | 2014-04-10 |
Family
ID=50656863
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020130079293A KR101382356B1 (en) | 2013-07-05 | 2013-07-05 | Apparatus for forgery detection of audio file |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101382356B1 (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101566425B1 (en) | 2014-07-23 | 2015-11-06 | 대한민국 | Detecting Method of Suspicious Points of Editing through Analysis of Frequency Distribution |
CN112397102A (en) * | 2019-08-14 | 2021-02-23 | 腾讯科技(深圳)有限公司 | Audio processing method and device and terminal |
CN113409771A (en) * | 2021-05-25 | 2021-09-17 | 合肥讯飞数码科技有限公司 | Detection method for forged audio frequency, detection system and storage medium thereof |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6973573B1 (en) | 2000-02-23 | 2005-12-06 | Doug Carson & Associates, Inc. | Detection of a digital data fingerprint |
KR20080098878A (en) * | 2007-05-07 | 2008-11-12 | (주)엔써즈 | Method and apparatus for generating audio fingerprint data and comparing audio data using the same |
US20090031425A1 (en) | 2007-07-27 | 2009-01-29 | International Business Machines Corporation | Methods, systems, and computer program products for detecting alteration of audio or image data |
JP2009070026A (en) | 2007-09-12 | 2009-04-02 | Mitsubishi Electric Corp | Recording device, verification device, reproduction device, recording method, verification method, and program |
- 2013-07-05: KR application KR1020130079293A filed; patent KR101382356B1 active (IP Right Grant)
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
A302 | Request for accelerated examination | ||
E902 | Notification of reason for refusal | ||
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant |