CN103730128A - Audio clip authentication method based on frequency spectrum SIFT feature descriptor - Google Patents
- Publication number: CN103730128A
- Application number: CN201210389030.7A
- Authority: CN (China)
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
- Classification: Signal Processing for Digital Recording and Reproducing
Abstract
The invention belongs to the field of information security and protection, and relates to an audio clip authentication method based on spectrogram SIFT feature descriptors: a content authentication method, built on computer vision techniques, that takes audio clips as its detection objects. SIFT feature matching is used to extract feature descriptors and to align the suspicious audio clip under test within a reference audio file. The clip is then divided into blocks using time-stretching factors extracted from the SIFT key points; these time-domain blocks directly reveal malicious cutting, insertion, and similar edits. Pitch-shifting factors are further estimated to characterize the corresponding time-frequency units, from which a robust hash is computed. Through matching and hash detection, the method not only accurately authenticates the integrity and authenticity of the suspicious clip, but also precisely localizes malicious tampering and classifies it by type.
Description
Technical field
The invention belongs to the field of information security, specifically tamper-resistant audio authentication. It relates to an audio clip authentication method based on spectrogram SIFT feature descriptors: a content authentication method, based on computer vision techniques, that takes audio clips as its detection objects.
Background technology
Audio content authentication comprises techniques for verifying the integrity and authenticity of audio data such as music and speech. Its goal is to guarantee that the data obtained by the receiver has not been maliciously edited or tampered with by a third party in transit, i.e. that, from the standpoint of the human perceptual system, it is identical to the original audio. Unlike traditional signature-based authentication, multimedia authentication certifies the file's content rather than simply protecting its bit stream. Audio authentication currently has important applications in national security, trade secrets, news recording, music distribution and copyright protection, military communication, and many other fields.
To date, only a small number of audio authentication methods have been published; they are summarized below.
Document [1] proposed a semi-fragile speech watermark for content-integrity checking based on exponential-scale quantization. The method embeds the watermark in the DFT domain, requires no auxiliary data for its integrity check, and can distinguish malicious tampering from content-preserving operations; however, it was tested only against a few tolerated operations such as resampling, white-noise contamination, and speech coding.
Document [2] proposed a feature-based authentication method built on the principle that two audio signals of similar perceptual quality have highly similar masking curves. A hash of the audio masking curve is computed and then embedded into the audio signal as a watermark using a known data-hiding method. At detection time, the extracted watermark is compared with a freshly computed hash and their correlation coefficient is calculated. Because this coefficient degrades gracefully as perceptual quality degrades, a decision threshold can be set according to an acceptable quality standard. The method can distinguish signal processing such as MP3 compression from malicious tampering.
Document [3] introduced two methods for audio content authentication. The first discusses audio features that tolerate several common signal-processing operations; the second, aimed at maximum security, detects changes to every bit and reconstructs the original audio through reversible watermarking. The latter further combines digital signatures with digital watermarking, using a key to yield a publicly verifiable scheme that can rebuild the original audio.
Document [4] proposed a fingerprint-based method that combines a robust hash function with robust watermarking to verify the integrity of audio files. The experiments mainly target MP3 compression: at higher bit rates (128 kbps and above) the bit error rate stays below 7%, but at low bit rates such as 32 kbps it rises to roughly 40%.
Documents [5] and [6] applied distributed source coding to quality-preserving audio/video authentication and malicious-attack detection. Document [5] uses a reference audio and achieves robust authentication through Slepian-Wolf encoding and decoding, but assumes in advance that the audio to be verified is already aligned with the original reference. Document [6] uses a compact hash signature, reducing the storage of the reference database to between 20% and 70%.
All of the above methods operate on audio under test that has the same length as the reference audio; in practice, however, the audio to be authenticated is often only a fragment. The present invention therefore provides a new, previously unstudied technique: audio fragment content authentication. Building on conventional audio authentication, fragment authentication first locates and matches a short audio clip against a reference containing the original material, then applies a hash or watermarking algorithm to obtain the authentication result.
Reference related to the present invention has:
[1] C. P. Wu and C.-C. Kuo, "Fragile speech watermarking based on exponential scale quantization for tamper detection," ICASSP 2002, pp. 3305-3308.
[2] R. Radhakrishnan and N. Memon, "Audio content authentication based on psycho-acoustic model," SPIE Security and Watermarking of Multimedia Contents, 4675:110-117, 2002.
[3] M. Steinebach and J. Dittmann, "Watermarking-based digital audio data authentication," EURASIP Journal on Applied Signal Processing, 10:1001-1015, 2003.
[4] S. Zmudzinski and M. Steinebach, "Perception-based audio authentication watermarking in the time-frequency domain," IH 2009, pp. 146-160.
[5] D. Varodayan, Y. C. Lin and B. Girod, "Audio authentication based on distributed source coding," ICASSP 2008, pp. 225-228.
[6] G. Valenzise, G. Prandi, M. Tagliasacchi and A. Sarti, "Identification of sparse audio tampering using distributed source coding and compressive sensing techniques," EURASIP Journal on Image and Video Processing, 2009:1-12.
Summary of the invention
The object of the invention is a new audio content authentication method for the field of information protection and authentication: specifically, an audio fragment authentication method based on spectrogram SIFT feature descriptors, i.e. a content authentication method, based on computer vision techniques, that takes audio fragments as its detection objects.
The proposed method addresses the problem of testing the integrity and authenticity of audio fragments. It uses SIFT (Scale-Invariant Feature Transform) feature matching to extract feature descriptors and to align the suspicious audio fragment under test within the reference audio file. Time-stretching factors extracted from the SIFT key points are then used to divide the suspicious fragment into blocks; these time-domain blocks can directly identify malicious cutting, insertion, and similar edits. In addition, an estimated pitch-shifting factor characterizes each corresponding time-frequency unit, enabling robust hash computation. Through matching and hash detection, the method can accurately verify the integrity and authenticity of the suspicious fragment, and can precisely localize malicious tampering and classify it by type.
Compared with traditional authentication methods, which require the complete audio under test, the present invention needs only a fragment of the audio to judge its authenticity and integrity, better matching practical applications. The method consists of three parts: fragment alignment based on spectrogram SIFT local descriptors (steps 1 to 4), robust hash (Hash) value computation (steps 5 to 6), and the authentication decision (step 7). The concrete steps are as follows:
Step 1: use the Short-Time Fourier Transform (STFT) to convert the one-dimensional audio signal into its corresponding two-dimensional time-frequency representation, keeping the low-to-mid frequency band of 100-3000 Hz.
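The first alignment step (per claim 1) converts the one-dimensional signal into a spectrogram restricted to 100-3000 Hz. The following is a minimal numpy sketch of that step, not the patent's implementation; the FFT size, hop, and Hann window are assumptions.

```python
import numpy as np

def spectrogram_band(signal, sr, n_fft=1024, hop=512, fmin=100.0, fmax=3000.0):
    """STFT magnitude spectrogram restricted to [fmin, fmax] Hz."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames, axis=1)).T      # (freq_bins, n_frames)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    band = (freqs >= fmin) & (freqs <= fmax)          # keep 100-3000 Hz only
    return spec[band], freqs[band]

# usage: one second of a 440 Hz tone at 44.1 kHz
sr = 44100
t = np.arange(sr) / sr
spec, freqs = spectrogram_band(np.sin(2 * np.pi * 440 * t), sr)
```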
Step 2: compute feature descriptors
Compute the 128-dimensional SIFT feature descriptors of the suspicious audio signal and of the reference audio signal respectively, and obtain matched SIFT key points (Matched SIFT Key Points) by comparing the two sets of descriptors. Let the number of matched pairs be N; among the matched pairs, the leftmost and rightmost match points in the suspicious audio and the leftmost and rightmost match points in the reference audio can then be expressed accordingly.
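In computer-vision practice, SIFT descriptor matching is usually done with a nearest-neighbour search plus Lowe's ratio test. The sketch below is an illustration under that assumption, not the patent's implementation; the 0.8 ratio value is likewise an assumption.

```python
import numpy as np

def match_descriptors(desc_d, desc_r, ratio=0.8):
    """Lowe-style ratio-test matching of 128-D SIFT descriptors.

    desc_d: (Nd, 128) suspicious-audio descriptors; desc_r: (Nr, 128) reference.
    Returns a list of (i, j) index pairs of matched key points."""
    matches = []
    for i, d in enumerate(desc_d):
        dists = np.linalg.norm(desc_r - d, axis=1)
        j, k = np.argsort(dists)[:2]
        if dists[j] < ratio * dists[k]:   # best match clearly beats 2nd best
            matches.append((i, j))
    return matches
```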
Step 3: let the length of the suspicious audio fragment be L_d and the length of the reference audio be L_r, and let the distances from the fragment's left and right boundaries to the leftmost and rightmost SIFT feature points be given; the corresponding mapped distances in the reference audio are then calculated with formula (1).
Step 4: arrange the SIFT key points in ascending chronological order; the position of the suspicious audio fragment in the reference audio can then be located with formula (2), where L_r, SR_r, and P_r are respectively the frame count, sampling rate, and duration (in seconds) of the reference audio.
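Formula (2) itself is not reproduced in this text. One plausible reading, given that L_r, SR_r, and P_r are the reference's frame count, sampling rate, and duration, is a proportional mapping from frame index to time; the helper below is hypothetical, for illustration only.

```python
def frame_to_seconds(frame_index, n_frames, duration_s):
    """Map an STFT frame index to a time in seconds, assuming frames are
    spaced uniformly over the reference audio (L_r frames over P_r seconds)."""
    return frame_index * duration_s / n_frames
```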
Step 5: pre-processing against audio attacks (time-scaling, pitch shifting, time-domain cutting and insertion, etc.)
(1) Time-domain partitioning: divide the suspicious audio and the reference audio into several small blocks so that the time-stretching factor (time-stretching factor) within each block is approximately constant; this both exposes maliciously cut or inserted fragments and simplifies the subsequent robust hash computation.
The concrete processing is as follows:
For a pair of matched SIFT key points, the time-domain distance L_{D,R} between two consecutive points and the corresponding time-stretching factor R_i are defined as follows.
The set {R_i} is then divided into several subsets {C_i}: for every two consecutive subsets, e.g. C_i = {R_j, ..., R_k} and C_{i+1} = {R_{k+1}, ..., R_l}, the average time-stretching factors show a clear gap.
Using the SIFT key points assigned to each subset and the fragment-matching method of step 2, the suspicious audio fragment and the reference audio are divided into corresponding block pairs {B_1, B_2, ..., B_m}, as shown in Figure 2.
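The partitioning above can be sketched as follows. Because the defining formula appears only as an image in the source, the definition of R_i as the ratio of consecutive matched-key-point spacings (reference over suspicious) and the jump threshold are both assumptions.

```python
import numpy as np

def stretch_factors(t_d, t_r):
    """Per-interval time-stretching factors R_i, assumed to be the ratio of
    consecutive matched-key-point spacings (reference over suspicious)."""
    return np.diff(np.sort(t_r)) / np.diff(np.sort(t_d))

def split_subsets(R, jump=0.3):
    """Partition {R_i} into subsets {C_i} wherever the factor jumps by more
    than `jump`; consecutive subsets then have clearly different means."""
    cuts = np.where(np.abs(np.diff(R)) > jump)[0] + 1
    return np.split(np.asarray(R, dtype=float), cuts)
```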
Figure 2 illustrates the time-domain partitioning in detail. If the suspicious audio has previously undergone only timing-independent signal processing, such as lossy compression, then all extracted time-stretching factors R_i are approximately equal (≈ 1), and the suspicious audio and its corresponding reference audio each form a single time block. If the suspicious audio has undergone uniform time-scaling, all R_i are again approximately equal (all > 1 or all < 1), and each signal still forms a single block. If, however, the middle third of the suspicious audio has been cut out, the R_i values spanning the cut become smaller, forming a concave point; in this case both signals are divided into three blocks, and the left and right sections, unaffected by the cut, still align with each other.
(2) Frequency alignment: estimate the pitch-shifting factor from the matched SIFT key points, thereby establishing the frequency correspondence between the suspicious spectrum and the reference spectrum.
The concrete processing is as follows: under a pitch-shifting attack, the frequency values of the suspicious audio are scaled proportionally relative to the reference audio, so to describe the frequency content the pitch-shifting factor (Pitch-Shifting Factor) must be estimated first.
For a matched pair of SIFT key points, their frequency contents can be compared directly, and the pitch-shifting factor is obtained from the following formula as the ratio of the fundamental frequencies of the reference audio and the suspicious audio; the concrete correspondence is shown in Figure 3.
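Since the estimating formula is given only as an image in the source, the sketch below assumes the pitch-shifting factor is the ratio of reference to suspicious frequencies at matched key points, with a median over all matches to suppress mismatched pairs.

```python
import numpy as np

def pitch_shift_factor(f_d, f_r):
    """Estimate the pitch-shifting factor from matched key-point frequencies,
    assumed to be the ratio of reference (f_r) to suspicious (f_d) values;
    the median over all matched pairs suppresses outlier matches."""
    ratios = np.asarray(f_r, dtype=float) / np.asarray(f_d, dtype=float)
    return float(np.median(ratios))
```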
Step 6: robust hash computation
The Philips algorithm is adopted to compute the hash codes. For each corresponding block pair, the fragment length and frequency range are first adjusted using formula (5), where the adjustment terms are the block pair's average time-stretching factor and average pitch-shifting factor.
Let E_{D,R}(k, n) denote the spectral energy of the k-th frequency band in the n-th time frame. The frequency band is divided into 33 non-overlapping sub-bands, and the 32-bit hash code of each region is computed with formula (6).
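Formula (6) is not reproduced in the source. The Philips scheme the text names is commonly the Haitsma-Kalker fingerprint, whose standard bit rule is sketched below under that assumption: with 33 sub-bands, bit (k, n) is set when the band-energy difference E(k,n) - E(k+1,n) increases relative to the previous frame.

```python
import numpy as np

def philips_hash_bits(E):
    """32 hash bits per frame from a (33 bands, n_frames) energy matrix,
    following the standard Haitsma-Kalker rule (assumed for formula (6)):
    bit(k, n) = 1 iff (E[k,n]-E[k+1,n]) - (E[k,n-1]-E[k+1,n-1]) > 0."""
    d = E[:-1, :] - E[1:, :]                            # 32 band diffs/frame
    return (d[:, 1:] - d[:, :-1] > 0).astype(np.uint8)  # compare to frame n-1
```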
Step 7: modification-type detection
(1) Malicious cutting/insertion of fragments: check whether the time-stretching-factor curve contains a concave or convex point to judge whether the audio file has been maliciously cut or had material inserted. If the time-stretching factor is approximately 1, or some other fixed constant, the signal has undergone only timing-independent processing or uniform time-scaling; if a malicious cut or insertion has occurred, a concave or convex point appears at the corresponding position on the curve, as shown in Figures 4 and 5.
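A minimal sketch of the concave/convex-point test follows; the deviation threshold is an assumption, as the patent gives no numeric value. Blocks whose time-stretching factor dips below the global median suggest a cut, and bumps above it suggest an insertion.

```python
import numpy as np

def detect_cut_or_insert(R, tol=0.2):
    """Flag blocks whose time-stretching factor deviates from the global
    median by more than `tol` (threshold is an assumption): a concave point
    (dip) suggests a cut, a convex point (bump) suggests an insertion."""
    R = np.asarray(R, dtype=float)
    base = float(np.median(R))
    dips = np.where(R < base - tol)[0].tolist()
    bumps = np.where(R > base + tol)[0].tolist()
    return dips, bumps
```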
(2) Malicious frequency modification: construct histograms of the SIFT key points of the suspicious signal and the reference signal, and compare them to judge whether the suspicious file has suffered malicious frequency modification. Bandwidth truncation, for example, is a typical malicious frequency modification: in the truncated spectral region, the number of SIFT key points matching the reference audio drops sharply. The invention therefore builds a histogram of the matched SIFT key points over frequency and compares the two histograms to determine whether such tampering has occurred. As shown in Figure 6, the horizontal axis divides the 100-3000 Hz range into 30 bands, and the vertical axis gives the number of matched SIFT key points per band; comparing the left and right plots, the right plot has almost no match points in the 800-900 Hz band, indicating that this band has very likely been tampered with.
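The histogram comparison can be sketched as follows, using the 30 bands over 100-3000 Hz described for Figure 6; the 10% retention-ratio threshold is an assumption.

```python
import numpy as np

def band_truncation_check(freqs_ref, freqs_sus, fmin=100.0, fmax=3000.0,
                          bins=30, min_ratio=0.1):
    """Histogram matched key-point frequencies into 30 bands (as in Fig. 6)
    and flag bands where the suspicious audio keeps fewer than `min_ratio`
    of the reference's matches (the ratio threshold is an assumption)."""
    edges = np.linspace(fmin, fmax, bins + 1)
    h_ref, _ = np.histogram(freqs_ref, bins=edges)
    h_sus, _ = np.histogram(freqs_sus, bins=edges)
    suspect = (h_ref > 0) & (h_sus < min_ratio * h_ref)
    return [(float(edges[i]), float(edges[i + 1])) for i in np.where(suspect)[0]]
```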
(3) Content modification: use the bit error rate (BER) to judge whether the file has suffered malicious content modification; formula (7) defines the threshold T and the decision rule. If BER ≤ T, authentication passes: the audio under test has not been maliciously tampered with and the file's content integrity is intact. If BER > T, authentication fails: the audio has been maliciously tampered with.
In one embodiment of the invention, a given suspicious audio fragment x_d(·) is compared against a longer reference audio x_r(·) to detect whether x_d(·) has been attacked or tampered with. Time-domain cutting/insertion and frequency-domain bandwidth truncation are judged, respectively, with the time-domain stretching factors and with the histogram of matched SIFT key points. If neither is detected, the robust hash values of x_d(·) and x_r(·) are compared: if the BER is below the threshold, authentication passes, indicating that the suspicious audio has undergone no semantic tampering, only content-preserving operations such as lossy compression, TSM, or pitch shifting; if the BER is above the threshold, authentication fails.
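The BER comparison of formula (7) can be sketched as follows; the numeric value of the threshold T is an assumption, since the patent defines it only in the formula image.

```python
import numpy as np

def ber_decision(bits_d, bits_r, T=0.25):
    """Bit error rate between the two robust-hash bit arrays and the decision
    rule of formula (7); T = 0.25 is an assumed threshold, not the patent's."""
    bits_d = np.asarray(bits_d)
    bits_r = np.asarray(bits_r)
    ber = float(np.mean(bits_d != bits_r))
    return ber, ber <= T   # (BER, authentication passed?)
```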
Brief description of the drawings
Fig. 1: SIFT key-point matching between the spectrum of the suspicious audio fragment and the spectrum of the reference audio file.
Fig. 2: illustration of time-domain partitioning, based on the time-stretching factor, when the suspicious audio contains a cut.
Fig. 3: frequency alignment between the block spectra of the suspicious audio and the reference audio.
Fig. 4: change in the time-stretching-factor curve caused by cutting.
Fig. 5: change in the time-stretching-factor curve caused by insertion.
Fig. 6: histograms of matched SIFT key-point counts vs. frequency band before (left) and after (right) bandwidth truncation.
Embodiment
To verify the validity of the method described above, the following experiments were carried out.
The test database comprises 1030 audio files covering a variety of speech and music signals. Each file is 2 minutes long, WAVE format, mono, sampled at 44.1 kHz. Each suspicious fragment to be authenticated is 10 s long and is cut at random from one of the original files. The experiments first measure the constructed authentication system's true positive rate (TPR), i.e. the pass rate under content-preserving operations, and true negative rate (TNR), i.e. the failure rate under malicious tampering; both measure the system's accuracy, and higher values indicate a more accurate system.
Table 1. Modification types and their corresponding TPR/TNR.
The results show that under content-preserving operations (MP3 compression at 32 kbps, TSM of ±10%, pitch shifting within ±20%, and low-pass filtering at 4 kHz and 8 kHz) the system's authentication pass rate stays at an accuracy level of at least 81%; when TSM increases to ±20%, accuracy drops somewhat but remains around 80%. Notably, even without any attack, the system built on SIFT feature matching cannot guarantee 100% authentication accuracy: of 1030 unmodified audio fragments, only about 968 pass authentication. The reason is that matching cannot always locate a fragment at its exact position in the reference signal, so authentication can fail even in the absence of any tampering.
For the three kinds of time-domain tampering (replacement, cutting, and insertion), the authentication failure rates are 99.4%, 99.6%, and 100% respectively, confirming the effectiveness of the time-stretching factor for identifying time-domain tampering. For malicious bandwidth truncation in the frequency domain (truncated bands of 800-900 Hz and 1500-1600 Hz), the accuracies of 82.9% and 76.7% are comparatively lower than in the time domain.
Two error statistics important in practical authentication systems were also measured. One is the false positive rate (FPR), the proportion of tampering mistaken for content preservation, i.e. the type I error; the other is the false negative rate (FNR), the proportion of content preservation mistaken for tampering, i.e. the type II error. The system is characterized by the confusion matrix of Table 2.
Table 2. Confusion matrix of authentication error rates.
Table 2 shows that of 14420 content-preserving operations, 2340 were wrongly judged as tampering, giving an FNR of 0.1623. The reason is that content-preserving operations still alter the fragment's time-frequency representation, affecting the accuracy of fragment alignment; moreover, to meet the demands of authentication, the threshold is set so that borderline content-preserving cases tend to be judged as tampering. Of 3090 tampering operations, 10 were wrongly judged as content-preserving, giving an FPR of 0.00324. The matrix of Table 2 shows that an audio authentication system using this method effectively discriminates tampering from content-preserving operations.
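The reported rates follow directly from the counts given in the text, as this arithmetic check shows:

```python
# Counts reported from Table 2 in the text.
fn, keep_total = 2340, 14420     # content-preserving trials judged tampered
fp, tamper_total = 10, 3090      # tampered trials judged content-preserving
FNR = fn / keep_total            # type II error rate, reported as 0.1623
FPR = fp / tamper_total          # type I error rate, reported as 0.00324
```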
Claims (2)
1. An audio fragment authentication method based on spectrogram SIFT feature descriptors, characterized in that it comprises: fragment alignment based on spectrogram SIFT local descriptors (steps 1 to 4), robust hash value computation (steps 5 to 6), and an authentication decision (step 7):
Step 1: use the Short-Time Fourier Transform (STFT) to convert the one-dimensional audio signal into its corresponding two-dimensional time-frequency representation, keeping the low-to-mid frequency band of 100-3000 Hz;
Step 2: compute feature descriptors
Compute the 128-dimensional SIFT feature descriptors of the suspicious audio signal and of the reference audio signal respectively, and obtain matched key points by comparing the two sets of descriptors; let the number of matched pairs be N; among the matched pairs, the leftmost and rightmost match points in the suspicious audio and the leftmost and rightmost match points in the reference audio are expressed accordingly;
Step 3: let the length of the suspicious audio fragment be L_d and the length of the reference audio be L_r, and let the distances from the fragment's left and right boundaries to the leftmost and rightmost SIFT feature points be given; the corresponding mapped distances in the reference audio are obtained with formula (1);
Step 4: arrange the SIFT key points in ascending chronological order, and locate the position of the suspicious audio fragment in the reference audio with formula (2), where L_r, SR_r, and P_r are respectively the frame count, sampling rate, and duration (in seconds) of the reference audio;
Step 5: pre-processing against audio attacks:
Time-domain partitioning: divide the suspicious audio and the reference audio into several small blocks so that the time-stretching factor within each block is approximately constant, which both exposes maliciously cut or inserted fragments and simplifies the subsequent robust hash computation;
Frequency alignment: estimate the pitch-shifting factor from the matched SIFT key points, thereby establishing the frequency correspondence between the suspicious spectrum and the reference spectrum;
Step 6: robust hash computation
The Philips method is adopted to compute the hash codes; for each corresponding block pair, the fragment length and frequency range are first adjusted using formula (3), where the adjustment terms are the block pair's average time-stretching factor and average pitch-shifting factor; let E_{D,R}(k, n) denote the spectral energy of the k-th frequency band in the n-th time frame; the frequency band is divided into 33 non-overlapping sub-bands, and the 32-bit hash code of each region is computed with formula (4);
Step 7: modification-type detection
Malicious cutting/insertion of fragments: check whether the time-stretching-factor curve contains a concave or convex point to judge whether the audio file has been maliciously cut or had material inserted;
Malicious frequency modification: construct histograms of the SIFT key points of the suspicious signal and the reference signal, and compare them to judge whether the suspicious file has suffered malicious frequency modification;
Content modification: use the bit error rate to judge whether the file has suffered malicious content modification; formula (5) defines the threshold T and the decision rule: if BER ≤ T, authentication passes, indicating that the audio under test has not been maliciously tampered with and that the file's content integrity is intact; if BER > T, authentication fails, indicating that the audio has been maliciously tampered with.
2. The method according to claim 1, characterized in that the attacks of step 5 are: time-scaling, pitch shifting, and time-domain cutting or insertion.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210389030.7A CN103730128A (en) | 2012-10-13 | 2012-10-13 | Audio clip authentication method based on frequency spectrum SIFT feature descriptor |
Publications (1)
Publication Number | Publication Date |
---|---|
CN103730128A true CN103730128A (en) | 2014-04-16 |
Family
ID=50454174
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104134443A (en) * | 2014-08-14 | 2014-11-05 | 兰州理工大学 | Symmetrical ternary string represented voice perception Hash sequence constructing and authenticating method |
CN104361889A (en) * | 2014-10-28 | 2015-02-18 | 百度在线网络技术(北京)有限公司 | Audio file processing method and device |
CN105227311A (en) * | 2014-07-01 | 2016-01-06 | 腾讯科技(深圳)有限公司 | Verification method and system |
CN107785023A (en) * | 2016-08-25 | 2018-03-09 | 财团法人资讯工业策进会 | Voiceprint identification device and voiceprint identification method thereof |
CN108665905A (en) * | 2018-05-18 | 2018-10-16 | 宁波大学 | A kind of digital speech re-sampling detection method based on band bandwidth inconsistency |
CN108766464A (en) * | 2018-06-06 | 2018-11-06 | 华中师范大学 | Digital audio based on mains frequency fluctuation super vector distorts automatic testing method |
CN109284717A (en) * | 2018-09-25 | 2019-01-29 | 华中师范大学 | It is a kind of to paste the detection method and system for distorting operation towards digital audio duplication |
CN115798490A (en) * | 2023-02-07 | 2023-03-14 | 西华大学 | Audio watermark implantation method and device based on SIFT |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1169199A (en) * | 1995-01-26 | 1997-12-31 | 苹果电脑公司 | System and method for generating and using context dependent subsyllable models to recognize a tonal language |
US20120132056A1 (en) * | 2010-11-29 | 2012-05-31 | Wang Wen-Nan | Method and apparatus for melody recognition |
Non-Patent Citations (2)
Title |
---|
XIANGYANG XUE ET AL.: "Towards content-based audio fragment authentication", MM '11: Proceedings of the 19th ACM International Conference on Multimedia, 31 December 2011, pages 1249-1252 * |
TANG CHAOWEI ET AL.: "An improved SIFT descriptor and its performance analysis", Geomatics and Information Science of Wuhan University (《武汉大学学报·信息科学版》), vol. 37, no. 1, 31 January 2012, pages 11-16 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| C02 | Deemed withdrawal of patent application after publication (patent law 2001) | |
| WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 2014-04-16 |