CN112364386A - Audio tampering detection and recovery method combining compressed sensing and DWT - Google Patents

Audio tampering detection and recovery method combining compressed sensing and DWT

Info

Publication number
CN112364386A
CN112364386A (application CN202011132924.9A)
Authority
CN
China
Prior art keywords
formula
tampered
information
audio
watermark information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011132924.9A
Other languages
Chinese (zh)
Other versions
CN112364386B (en)
Inventor
胡洋霞
魏建国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN202011132924.9A priority Critical patent/CN112364386B/en
Publication of CN112364386A publication Critical patent/CN112364386A/en
Application granted granted Critical
Publication of CN112364386B publication Critical patent/CN112364386B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60Protecting data
    • G06F21/64Protecting data integrity, e.g. using checksums, certificates or signatures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/10Protecting distributed programs or content, e.g. vending or licensing of copyrighted material ; Digital rights management [DRM]
    • G06F21/16Program or content traceability, e.g. by watermarking

Abstract

The invention discloses an audio tampering detection and recovery method combining compressed sensing and DWT, which comprises the following steps: the original audio is framed; the DCT coefficients of each frame are extracted; the compressed coefficients of each frame are connected; a linear transformation yields the unquantized reference values; a matrix A is calculated and the floating-point watermark information is converted to integer form to obtain the watermark information to be embedded; the watermark information is embedded into the high-frequency region of the discrete wavelet transform of the original audio; the watermark information is extracted, it is judged whether the audio has been tampered with, and the tampered area is located; the watermark information of the tampered area is discarded and the effective watermark information of the untampered area is extracted to obtain an approximation of the quantized reference values; the compressed information of the damaged area is obtained and decompressed, an inverse discrete wavelet transform is performed, and the undamaged area is connected to obtain the recovered speech signal. The invention improves the peak signal-to-noise ratio of the embedded watermark information and the intelligibility of the audio after tamper recovery; the calculation process is simpler and tamper localization is more accurate.

Description

Audio tampering detection and recovery method combining compressed sensing and DWT
Technical Field
The invention belongs to the field of multimedia and signal processing, relates in particular to the problem of authenticating and protecting the integrity of speech signals, and more particularly relates to an audio tampering detection and recovery method combining compressed sensing and DWT.
Background
For the problems of integrity authentication and protection of multimedia signals, many watermark-based methods exist, but most of them are designed for images or video; their effect on speech is not ideal, and the peak signal-to-noise ratio after recovery still needs to be improved. Because the human auditory perception system is more sensitive than the visual perception system, the signal into which the watermark is embedded must have better imperceptibility, and at the same time the recovered speech should be easier for listeners to understand.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provides an audio tampering detection and recovery method combining compressed sensing and DWT. The method improves the peak signal-to-noise ratio of the embedded watermark information and the intelligibility of the audio after tamper recovery; the watermark information is generated by calculating quantized reference values, so the calculation process is simpler and tamper localization is more accurate.
The purpose of the invention is realized by the following technical scheme.
The invention discloses an audio tampering detection and recovery method combining compressed sensing and DWT, which comprises the following processes:
firstly, framing original audio;
step two, extracting DCT coefficients of each frame of the original audio;
step three, connecting the compressed coefficients of each frame to obtain formula (1);
[Formula (1) appears only as an image in the original and is not reproduced here.]
wherein the quantities connected in formula (1) are the compressed coefficients of each frame of the original signal, and v is the vector obtained by connecting the coefficients of each frame after grouping and rearrangement;
step four, performing linear transformation on the vector in formula (1), and calculating according to formula (2) to obtain the unquantized reference values;
[Formula (2) appears only as an image in the original and is not reproduced here.]
wherein r is an unquantized reference value, k is the number of the reference values, and the dimension of the matrix A is determined according to the random number seed and the number of the groups;
step five, calculating a matrix A according to a formula (3), and changing floating-point type watermark information into integer type according to a formula (4) and a formula (5) to obtain watermark information to be embedded;
[Formula (3) appears only as an image in the original and is not reproduced here.]
wherein A0 is generated from a random number seed, A(i, j) and A0(i, j) are the elements of matrices A and A0 respectively, and each compressed frame group contains n × m elements;
[Formula (4) appears only as an image in the original and is not reproduced here.]
wherein
f(t) = (q / Rmax) · t    (5)
the integer information quantized by formulas (4) and (5) is used as the watermark information, where Rmax is the maximum value after quantization, the function value corresponding to that maximum is the coefficient value of the sampling point at the corresponding position in the audio signal, and q is a quantization parameter (the symbols for the quantized integer information and this function value appear only as images in the original);
step six, according to an adaptive embedding algorithm, embedding the watermark information into the high-frequency region of the discrete wavelet transform of the original audio to complete the watermark embedding, wherein w is the watermark information and α is the embedding strength;
step seven, extracting the watermark information by the reverse of the embedding process; for Speech-type audio, if the number of tampered frames in a group does not affect the understanding of the semantics, the audio is judged not to be tampered, so a judgment threshold δ is set according to the speaking rate and the chosen frame duration, the number of mismatching frames in each group is compared with δ to decide whether the speech information has been tampered with, and the tampered area is located; for other types of audio, any tampering within a frame group affects the auditory effect, so the tampered area is located directly; the locating process is shown in formula (6);
[Formula (6) appears only as an image in the original and is not reproduced here.]
wherein p(i, j) represents the ith frame of the jth group, m is the number of frames, n is the number of groups, w′(i, j) is the extracted watermark, and w(i, j) is the generated watermark; for Speech-type audio, if the number of damaged frames in the jth group is smaller than the judgment threshold δ, the next group is examined, and if it is larger than δ, the group is judged to be a tampered group and the tampered area is located according to the value of i; for other types of audio the tampered area is located directly from the values of i and j;
step eight, discarding the watermark information of the tampered area, extracting effective watermark information of the area which is not tampered, and obtaining an approximate value of the quantized reference value according to a formula (7) and a formula (8);
[Formula (7) appears only as an image in the original and is not reproduced here.]
wherein the sequence α1, α2, ..., αM consists of the reference values extracted from the uncorrupted frames, and A(E) is the matrix obtained from A after deleting the rows corresponding to the reference values of the damaged area;
[Formula (8) appears only as an image in the original and is not reproduced here.]
wherein vR corresponds to the uncorrupted information in the compressed vector v and vT corresponds to the information of the damaged area in v; A(E,R) and A(E,T) are the parts of A(E) corresponding to vR and vT, respectively;
step nine, according to formula (4), the embedding side embeds quantized reference values into the original signal, so the reference values extracted by the extracting side are all quantized results rather than the sequence α1, α2, ..., αM; the quantized values are therefore processed to obtain an approximation of the sequence α1, α2, ..., αM, calculated according to formulas (9), (10) and (11);
[Formula (9) appears only as an image in the original and is not reproduced here.]
wherein formula (9) is the inverse of formula (4); Rmax, fx and the quantized value (whose symbol appears only as an image in the original) are obtained from formulas (4) and (5);
[Formulas (10) and (11) appear only as images in the original and are not reproduced here.]
wherein the vector is obtained by calculating r′(α1), r′(α2), ..., r′(αM), and α′1, α′2, ..., α′M are the processed extracted reference values, which can be taken as approximations of the original, unquantized reference values;
step ten, obtaining the compressed information of the damaged area according to the formula (12) and the formula (13), decompressing, performing inverse discrete wavelet transform on the compressed information, and connecting the undamaged area to obtain a recovered voice signal;
[Formula (12) appears only as an image in the original and is not reproduced here.]
the approximation of the information at the tampered location is written as:
[Formula (13) appears only as an image in the original and is not reproduced here.]
wherein formulas (12) and (13) constitute the decompression process: S1, S2, ..., SM are obtained from formula (12), the compressed quantity vT is obtained from formula (13), vT is decompressed to obtain the restored signal sequence, and the restored signal is finally obtained by inverse discrete wavelet transform and splicing of the signal sequences.
Compared with the prior art, the technical scheme of the invention has the following beneficial effects:
the method is based on the compressed sensing model and the DWT, so that the peak signal-to-noise ratio of the embedded watermark information and the intelligibility of the audio frequency after the tampering recovery are improved, the watermark information is generated by calculating the quantized reference value, the calculation process is simpler, and the tampering positioning is more accurate; the sensitivity of the auditory system and the characteristics of fragile watermarks are fully considered, and the method has better practicability.
Drawings
Fig. 1 shows the watermark embedding process,
wherein diagram (a) is the overall embedding process and diagram (b) is a schematic of the adaptive embedding process;
Fig. 2 shows the tamper detection, localization and recovery process;
Fig. 3 is an example of spectrograms before and after watermark embedding,
wherein graph (a) is the spectrogram of the original audio and graph (b) is the spectrogram of the watermarked audio;
Fig. 4 is an example of tamper detection and repair, wherein
graph (a) is the spectrogram of the watermarked original audio, graph (b) is the spectrogram of the audio with 20% of its content replaced,
and graph (c) is the spectrogram of the restored audio.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The audio tampering detection and recovery method of the invention, combining compressed sensing and DWT, adopts the principle of compressed sensing: the Discrete Cosine Transform (DCT) coefficients of the original audio information are extracted, reference values are formed by calculation, and the quantized reference values are used as the watermark information, which is embedded into the high-frequency region of the discrete wavelet transform of the original audio. This increases the embedding capacity and improves the peak signal-to-noise ratio of the watermark information while keeping the watermark fragile, locates the tampered area more accurately, and makes the recovered speech closer to the original audio. The specific technical scheme is as follows:
step one, framing an original audio.
And step two, extracting the DCT coefficient of each frame of the original audio.
And step three, connecting the compressed coefficients of each frame to obtain formula (1).
[Formula (1) appears only as an image in the original and is not reproduced here.]
Wherein v is the vector obtained by connecting the compressed coefficients of each frame after grouping and rearrangement.
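As a rough illustration of steps one to three, a minimal Python sketch (not part of the patent text) is given below. The 200-sample non-overlapping frames and 20-frame groups follow the example later in the description; the number of retained DCT coefficients (`keep`) and the simple row-wise concatenation order are assumptions.

import numpy as np
from scipy.fft import dct

def frames_of(signal, frame_len=200):
    """Split the audio into non-overlapping frames of frame_len samples."""
    n_frames = len(signal) // frame_len
    return np.asarray(signal, dtype=float)[:n_frames * frame_len].reshape(n_frames, frame_len)

def compressed_coefficients(frames, keep=50):
    """Per-frame DCT, keeping only the first `keep` coefficients as the
    compressed representation (the retained count is an assumption)."""
    coeffs = dct(frames, type=2, norm='ortho', axis=1)
    return coeffs[:, :keep]

def concat_group(group_coeffs):
    """Formula (1): connect the compressed coefficients of the frames in one group into one vector v."""
    return np.asarray(group_coeffs).reshape(-1)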
And step four, performing linear transformation on the vector in the formula (1), and calculating according to the formula (2) to obtain an unquantized reference value.
[Formula (2) appears only as an image in the original and is not reproduced here.]
Where r is an unquantized reference value, k is the number of reference values, and the dimension of matrix a is determined by the random number seed and the number of groups.
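A minimal sketch of step four, assuming from the description that the unquantized reference values are linear, compressed-sensing style measurements r = A v, with A0 generated from the shared random number seed. The Gaussian entries and the normalisation standing in for formula (3), which appears only as an image, are assumptions rather than the patented rule.

import numpy as np

def measurement_matrix(seed, k, group_len):
    """A0 from the shared seed (same on embedder and extractor); the scaling is a stand-in for formula (3)."""
    rng = np.random.default_rng(seed)
    A0 = rng.standard_normal((k, group_len))   # k reference values per frame group
    return A0 / np.sqrt(group_len)

def reference_values(A, v):
    """Assumed form of formula (2): unquantized reference values r = A v."""
    return A @ v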
And step five, calculating the matrix A according to the formula (3), and changing the floating-point watermark information into integer according to the formulas (4) and (5) to obtain the watermark information to be embedded.
[Formula (3) appears only as an image in the original and is not reproduced here.]
Wherein A0 is generated from a random number seed; A(i, j) and A0(i, j) are the elements of matrices A and A0, respectively, and each compressed frame group contains n × m elements.
[Formula (4) appears only as an image in the original and is not reproduced here.]
Wherein
f(t) = (q / Rmax) · t    (5)
The integer information quantized by formulas (4) and (5) is used as the watermark information, where Rmax is the maximum value after quantization, the function value corresponding to that maximum is the coefficient value of the sampling point at the corresponding position in the audio signal, and q is a quantization parameter (the symbols for the quantized integer information and this function value appear only as images in the original).
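The quantisation of step five can be sketched as follows. Only formula (5), f(t) = (q/Rmax)·t, is given in the text, so the rounding assumed here for formula (4) and the inverse scaling used later in step nine (formula (9)) are assumptions.

import numpy as np

def quantize(r, q, r_max):
    """Scale by q / r_max as in formula (5), then round to integers (our assumption
    for formula (4)) to obtain the integer watermark information."""
    return np.rint(q / r_max * np.asarray(r, dtype=float)).astype(int)

def dequantize(w, q, r_max):
    """Assumed inverse used in step nine: undo the scaling to approximate the
    unquantized reference values."""
    return r_max / q * np.asarray(w, dtype=float)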
Step six, according to the adaptive embedding algorithm shown in Fig. 1(b), the watermark information is embedded into the high-frequency region of the discrete wavelet transform of the original audio, completing the watermark embedding, where w is the watermark information and α is the embedding strength.
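A sketch of step six under the assumption of a one-level 'db1' DWT per frame and a plain additive rule cD + α·w in the detail (high-frequency) band. The actual adaptive embedding algorithm of Fig. 1(b) is not reproduced in the text, so this is a stand-in, not the patented embedding rule.

import numpy as np
import pywt

def embed_frame(frame, w_values, alpha=0.01, wavelet='db1'):
    """Embed watermark values into the high-frequency DWT band of one frame."""
    cA, cD = pywt.dwt(frame, wavelet)                      # low- and high-frequency coefficients
    w = np.resize(np.asarray(w_values, dtype=float), cD.shape)
    cD_marked = cD + alpha * w                             # additive embedding in the detail band
    return pywt.idwt(cA, cD_marked, wavelet)               # watermarked frame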
Step seven, the watermark information is extracted by reversing the embedding process. For Speech-type audio, if the number of tampered frames in a group does not affect the understanding of the semantics, the audio is judged not to be tampered; a judgment threshold δ is therefore set according to the speaking rate and the chosen frame duration, the number of mismatching frames in each group is compared with δ to decide whether the speech information has been tampered with, and the tampered area is located. For other types of audio (Pop, Jazz, Rock, Blues, Classic), any tampering within a frame group affects the auditory effect, so the tampered area is located directly. The locating process is shown in formula (6):
[Formula (6) appears only as an image in the original and is not reproduced here.]
Where p(i, j) denotes the ith frame of the jth group, m is the number of frames, n is the number of groups, w′(i, j) is the extracted watermark, and w(i, j) is the watermark generated as in Fig. 2. For Speech-type audio, if the number of damaged frames in the jth group is smaller than the judgment threshold δ, the next group is examined; if it is larger than δ, the group is judged to be a tampered group and the tampered area is located according to the value of i. Other types of audio locate the tampered area directly from the values of i and j.
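A sketch of the group-wise tamper localization of step seven. Formula (6) appears only as an image, so the comparison below (count of frames whose extracted watermark differs from the regenerated one, thresholded by δ for Speech) follows the surrounding description rather than the exact formula.

import numpy as np

def locate_tampered_groups(w_extracted, w_generated, delta, is_speech=True):
    """w_* have shape (n_groups, frames_per_group); returns indices of tampered groups."""
    damaged = (w_extracted != w_generated)        # p(i, j): frame-level mismatch
    per_group = damaged.sum(axis=1)               # number of damaged frames in each group
    if is_speech:
        return np.where(per_group > delta)[0]     # tolerate damage below the threshold
    return np.where(per_group > 0)[0]             # other genres: any mismatch counts as tampering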
And step eight, discarding the watermark information of the tampered area, extracting effective watermark information of the area which is not tampered, and obtaining an approximate value of the quantized reference value according to the formula (7) and the formula (8).
[Formula (7) appears only as an image in the original and is not reproduced here.]
Wherein the sequence α1, α2, ..., αM consists of the reference values extracted from the uncorrupted frames, and A(E) is the matrix obtained from A after deleting the rows corresponding to the reference values of the damaged area.
[Formula (8) appears only as an image in the original and is not reproduced here.]
Wherein vR corresponds to the uncorrupted information in the compressed vector v and vT corresponds to the information of the damaged area in v; A(E,R) and A(E,T) are the parts of A(E) corresponding to vR and vT, respectively.
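Steps eight and nine leave a linear system relating the surviving (de-quantized) reference values to the unknown compressed coefficients of the damaged frames. Formula (8) appears only as an image, so a least-squares solve is used here as a stand-in, under the assumption α′ ≈ A(E,R)·vR + A(E,T)·vT implied by the definitions above.

import numpy as np

def recover_damaged_coefficients(alpha_prime, A_E, damaged_cols, v_R):
    """alpha_prime: de-quantized reference values from undamaged frames;
    A_E: A with the rows of the damaged reference values deleted;
    damaged_cols: column indices of A belonging to the damaged frames;
    v_R: compressed coefficients recomputed from the undamaged frames."""
    all_cols = np.arange(A_E.shape[1])
    kept_cols = np.setdiff1d(all_cols, damaged_cols)
    A_ER, A_ET = A_E[:, kept_cols], A_E[:, damaged_cols]
    rhs = alpha_prime - A_ER @ v_R                       # remove the known contribution
    v_T, *_ = np.linalg.lstsq(A_ET, rhs, rcond=None)     # stand-in for formula (8)
    return v_T                                           # approximate coefficients of the damaged area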
Step nine, according to formula (4), the embedding side embeds the quantized reference values into the original signal, so the reference values extracted by the extracting side are all quantized results rather than the sequence α1, α2, ..., αM. After the quantized values are processed, an approximation of the sequence α1, α2, ..., αM can be obtained, calculated according to formulas (9), (10) and (11);
[Formula (9) appears only as an image in the original and is not reproduced here.]
Wherein formula (9) is the inverse of formula (4); Rmax, fx and the quantized value (whose symbol appears only as an image in the original) are obtained from formulas (4) and (5).
[Formulas (10) and (11) appear only as images in the original and are not reproduced here.]
Wherein the vector is obtained by calculating r′(α1), r′(α2), ..., r′(αM), and α′1, α′2, ..., α′M are the processed extracted reference values, which can be taken as approximations of the original, unquantized reference values.
Step ten, obtaining the compressed information of the damaged area according to the formula (12) and the formula (13), decompressing, performing inverse discrete wavelet transform on the compressed information, and connecting the undamaged area to obtain a recovered voice signal;
[Formula (12) appears only as an image in the original and is not reproduced here.]
The approximation of the information at the tampered location is written as:
[Formula (13) appears only as an image in the original and is not reproduced here.]
Wherein formulas (12) and (13) constitute the decompression process: S1, S2, ..., SM are obtained from formula (12), the compressed quantity vT is obtained from formula (13), vT is decompressed to obtain the restored signal sequence, and the restored signal is finally obtained by inverse discrete wavelet transform and splicing of the signal sequences.
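A sketch of step ten under the same assumption as above, namely that the per-frame compression kept the first `keep` DCT coefficients; decompression is then zero-padding followed by an inverse DCT, and the recovered frames are spliced back between the undamaged ones. The per-frame inverse DWT that reassembles the low- and high-frequency bands in the full method is omitted here for brevity.

import numpy as np
from scipy.fft import idct

def decompress_frames(v_T, n_damaged, keep, frame_len=200):
    """Stand-in for the decompression of formulas (12)-(13): re-pad and invert the DCT."""
    coeffs = np.zeros((n_damaged, frame_len))
    coeffs[:, :keep] = np.asarray(v_T, dtype=float).reshape(n_damaged, keep)
    return idct(coeffs, type=2, norm='ortho', axis=1)

def splice(received_frames, restored_frames, damaged_idx):
    """Replace only the damaged frames and flatten back into one speech signal."""
    out = received_frames.copy()
    out[damaged_idx] = restored_frames
    return out.reshape(-1)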
Example:
as shown in fig. 1, the first step of the present invention is to divide the original audio into frames, in the experiment, the sampling frequency is 8000Hz, the sampling precision is 16bits, 200 adjacent sampling points are divided into one frame, there is no overlap between frames, and each frame group contains 20 frames by taking the frame group as a unit. DWT is carried out on each frame of audio to obtain a high-frequency wavelet coefficient and a low-frequency wavelet coefficient, each frame is compressed, a DCT coefficient of each frame is extracted in an experiment, and a quantized reference value is calculated according to the formulas (1) - (5) to obtain watermark information. According to the illustration of fig. 1, watermark information is embedded in a high frequency region, the embedding strength α is 0.01, and IDWT is performed simultaneously with a low frequency region to obtain an audio signal containing a watermark.
Analyzing the watermarked audio signal with the method shows, as in Fig. 3, that the spectrograms before and after embedding are essentially the same. Subjective auditory identification experiments were performed and the signal-to-noise ratio was calculated: the identification accuracy ranged from 47% to 54%, close to the 50% chance level, and the average signal-to-noise ratio was 45.76 dB, at least 5 dB higher than other algorithms.
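For reference, a small helper for the SNR figures quoted above, assuming the usual definition of signal-to-noise ratio between the original and the watermarked (or recovered) audio; it is not taken from the patent itself.

import numpy as np

def snr_db(original, processed):
    """SNR in dB between a reference signal and its processed version."""
    x = np.asarray(original, dtype=float)
    noise = x - np.asarray(processed, dtype=float)
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum(noise ** 2))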
As shown in Fig. 2, the watermark is extracted and tamper detection, localization and recovery are performed. The received audio signal is first framed and processed as in embedding; DWT then yields the low-frequency and high-frequency coefficients, the watermark is extracted as shown in Fig. 2, and an XOR operation with the watermark generated during embedding is performed as in formula (6). The sum of the XOR values is compared with a detection threshold (20 in the experiment) to judge whether the Speech has been tampered with, and the tampered frames are located in units of groups. The tampered area is then recovered, again in units of frame groups: if no speech frame in a group is damaged, the method skips to the next group. Suppose each group contains m frames, each frame contains n sampling points, and k is the number of reference values generated from the random number seed; if z speech frames in a group are damaged, the reference values in the damaged frames are discarded, and the number of reference values that can still be extracted from the group is given by the expression shown only as an image in the original. vR is obtained from formulas (7) and (8); since the extracted reference values have been quantized, they are processed according to formulas (9), (10) and (11), and the resulting reference value sequence α′1, α′2, ..., α′M is approximately equal to the original reference values. vT is then obtained from formulas (12) and (13) and decompressed, the information of the damaged area is selected, IDWT is performed, and the undamaged speech is connected to obtain the recovered speech.
Using the method, 20% of the content of a piece of audio was replaced and the audio was then detected and restored; as shown in Fig. 4, the restored spectrogram is close to the undamaged spectrogram. Five sentences from the CASIA-863 database were selected and 100 random destruction experiments were carried out: after restoration with the method, the content of the damaged area could be clearly heard in 98% of the sentences, with an average SNR of 23.9 dB; when the destruction rate is as high as 50%, the average SNR is 10.32 dB and 80% of the destroyed content of a sentence remains intelligible after restoration. The SNR of the recovered speech was calculated in the experiments and found to decrease as the destruction rate increases, but it is on average 3-4 dB higher than that of other methods.
While the present invention has been described in terms of its functions and operations with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise functions and operations described above, and that the above-described embodiments are illustrative rather than restrictive, and that various changes and modifications may be effected therein by one skilled in the art without departing from the scope or spirit of the invention as defined by the appended claims.

Claims (1)

1. An audio tamper detection and recovery method combining compressed sensing and DWT, characterized by comprising the following processes:
firstly, framing original audio;
step two, extracting DCT coefficients of each frame of the original audio;
step three, connecting the compressed coefficients of each frame to obtain formula (1);
[Formula (1) appears only as an image in the original and is not reproduced here.]
wherein the quantities connected in formula (1) are the compressed coefficients of each frame of the original signal, and v is the vector obtained by connecting the coefficients of each frame after grouping and rearrangement;
step four, performing linear transformation on the vector in formula (1), and calculating according to formula (2) to obtain the unquantized reference values;
[Formula (2) appears only as an image in the original and is not reproduced here.]
wherein r is an unquantized reference value, k is the number of the reference values, and the dimension of the matrix A is determined according to the random number seed and the number of the groups;
step five, calculating a matrix A according to a formula (3), and changing floating-point type watermark information into integer type according to a formula (4) and a formula (5) to obtain watermark information to be embedded;
[Formula (3) appears only as an image in the original and is not reproduced here.]
wherein A0 is generated from a random number seed, A(i, j) and A0(i, j) are the elements of matrices A and A0 respectively, and each compressed frame group contains n × m elements;
[Formula (4) appears only as an image in the original and is not reproduced here.]
wherein
f(t) = (q / Rmax) · t    (5)
the integer information quantized by formulas (4) and (5) is used as the watermark information, where Rmax is the maximum value after quantization, the function value corresponding to that maximum is the coefficient value of the sampling point at the corresponding position in the audio signal, and q is a quantization parameter (the symbols for the quantized integer information and this function value appear only as images in the original);
step six, according to an adaptive embedding algorithm, embedding the watermark information into the high-frequency region of the discrete wavelet transform of the original audio to complete the watermark embedding, wherein w is the watermark information and α is the embedding strength;
step seven, extracting the watermark information by the reverse of the embedding process; for Speech-type audio, if the number of tampered frames in a group does not affect the understanding of the semantics, the audio is judged not to be tampered, so a judgment threshold δ is set according to the speaking rate and the chosen frame duration, the number of mismatching frames in each group is compared with δ to decide whether the speech information has been tampered with, and the tampered area is located; for other types of audio, any tampering within a frame group affects the auditory effect, so the tampered area is located directly; the locating process is shown in formula (6);
[Formula (6) appears only as an image in the original and is not reproduced here.]
wherein p(i, j) represents the ith frame of the jth group, m is the number of frames, n is the number of groups, w′(i, j) is the extracted watermark, and w(i, j) is the generated watermark; for Speech-type audio, if the number of damaged frames in the jth group is smaller than the judgment threshold δ, the next group is examined, and if it is larger than δ, the group is judged to be a tampered group and the tampered area is located according to the value of i; for other types of audio the tampered area is located directly from the values of i and j;
step eight, discarding the watermark information of the tampered area, extracting effective watermark information of the area which is not tampered, and obtaining an approximate value of the quantized reference value according to a formula (7) and a formula (8);
[Formula (7) appears only as an image in the original and is not reproduced here.]
wherein the sequence α1, α2, ..., αM consists of the reference values extracted from the uncorrupted frames, and A(E) is the matrix obtained from A after deleting the rows corresponding to the reference values of the damaged area;
[Formula (8) appears only as an image in the original and is not reproduced here.]
wherein vR corresponds to the uncorrupted information in the compressed vector v and vT corresponds to the information of the damaged area in v; A(E,R) and A(E,T) are the parts of A(E) corresponding to vR and vT, respectively;
step nine, according to formula (4), the embedding side embeds the quantized reference values into the original signal, so the reference values extracted by the extracting side are all quantized results rather than the sequence α1, α2, ..., αM; the quantized values are therefore processed to obtain an approximation of the sequence α1, α2, ..., αM, calculated according to formulas (9), (10) and (11);
[Formula (9) appears only as an image in the original and is not reproduced here.]
wherein formula (9) is the inverse of formula (4); Rmax, fx and the quantized value (whose symbol appears only as an image in the original) are obtained from formulas (4) and (5);
[Formulas (10) and (11) appear only as images in the original and are not reproduced here.]
wherein the vector is obtained by calculating r′(α1), r′(α2), ..., r′(αM), and α′1, α′2, ..., α′M are the processed extracted reference values, which can be taken as approximations of the original, unquantized reference values;
step ten, obtaining the compressed information of the damaged area according to the formula (12) and the formula (13), decompressing, performing inverse discrete wavelet transform on the compressed information, and connecting the undamaged area to obtain a recovered voice signal;
[Formula (12) appears only as an image in the original and is not reproduced here.]
the approximation of the information at the tampered location is written as:
[Formula (13) appears only as an image in the original and is not reproduced here.]
wherein formulas (12) and (13) constitute the decompression process: S1, S2, ..., SM are obtained from formula (12), the compressed quantity vT is obtained from formula (13), vT is decompressed to obtain the restored signal sequence, and the restored signal is finally obtained by inverse discrete wavelet transform and splicing of the signal sequences.
CN202011132924.9A 2020-10-21 2020-10-21 Audio tampering detection and recovery method combining compressed sensing and DWT Active CN112364386B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011132924.9A CN112364386B (en) 2020-10-21 2020-10-21 Audio tampering detection and recovery method combining compressed sensing and DWT

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011132924.9A CN112364386B (en) 2020-10-21 2020-10-21 Audio tampering detection and recovery method combining compressed sensing and DWT

Publications (2)

Publication Number Publication Date
CN112364386A true CN112364386A (en) 2021-02-12
CN112364386B CN112364386B (en) 2022-04-26

Family

ID=74511437

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011132924.9A Active CN112364386B (en) 2020-10-21 2020-10-21 Audio tampering detection and recovery method combining compressed sensing and DWT

Country Status (1)

Country Link
CN (1) CN112364386B (en)


Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090125310A1 (en) * 2006-06-21 2009-05-14 Seungjae Lee Apparatus and method for inserting/extracting capturing resistant audio watermark based on discrete wavelet transform, audio rights protection system using the same
CN102142258A (en) * 2011-03-31 2011-08-03 上海第二工业大学 Wavelet transform and Arnold based adaptive gray-scale watermark embedded method
CN102419979A (en) * 2011-11-23 2012-04-18 北京邮电大学 Audio semi-fragile watermarking algorithm for realizing precise positioning of altered area
CN103050120A (en) * 2012-12-28 2013-04-17 暨南大学 High-capacity digital audio reversible watermark processing method
CN104795071A (en) * 2015-04-18 2015-07-22 广东石油化工学院 Blind audio watermark embedding and watermark extraction processing method
CN106531176A (en) * 2016-10-27 2017-03-22 天津大学 Digital watermarking algorithm of audio signal tampering detection and recovery
CN106504757A (en) * 2016-11-09 2017-03-15 天津大学 A kind of adaptive audio blind watermark method based on auditory model
CN109119086A (en) * 2017-06-24 2019-01-01 天津大学 A kind of breakable watermark voice self-restoring technology of multilayer least significant bit
CN107993669A (en) * 2017-11-20 2018-05-04 西南交通大学 Voice content certification and tamper recovery method based on modification least significant digit weight
CN108198563A (en) * 2017-12-14 2018-06-22 安徽新华传媒股份有限公司 A kind of Multifunctional audio guard method of digital copyright protection and content authentication
CN110211016A (en) * 2018-02-28 2019-09-06 佛山科学技术学院 A kind of watermark embedding method based on convolution feature
CN110010142A (en) * 2019-03-28 2019-07-12 武汉大学 A kind of method of large capacity audio information hiding
CN111091841A (en) * 2019-12-12 2020-05-01 天津大学 Identity authentication audio watermarking algorithm based on deep learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
N. Vinay Kumar et al.: "Invisible watermarking in printed images", ResearchGate *
刘海燕 et al.: "Research on a fragile audio watermarking algorithm in the DWT domain" (基于DWT域的脆弱性音频水印算法研究), 《电子制作》 *
李建锋 et al.: "An audio watermarking algorithm based on DWT" (一种基于DWT的音频水印算法), 《电脑开发与应用》 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113421592A (en) * 2021-08-25 2021-09-21 中国科学院自动化研究所 Method and device for detecting tampered audio and storage medium

Also Published As

Publication number Publication date
CN112364386B (en) 2022-04-26

Similar Documents

Publication Publication Date Title
Zhang et al. Robust Reversible Audio Watermarking Scheme for Telemedicine and Privacy Protection.
CN108648748B (en) Acoustic event detection method under hospital noise environment
CN111564163B (en) RNN-based multiple fake operation voice detection method
CN106531176B (en) The digital watermarking algorithm of audio signal tampering detection and recovery
CN112364386B (en) Audio tampering detection and recovery method combining compressed sensing and DWT
CN110968845A (en) Detection method for LSB steganography based on convolutional neural network generation
CN102063907A (en) Steganalysis method for audio spread-spectrum steganography
CN102664013A (en) Audio digital watermark method of discrete cosine transform domain based on energy selection
Luo et al. Audio postprocessing detection based on amplitude cooccurrence vector feature
CN114842034B (en) Picture true and false detection method based on amplified fuzzy operation trace
Rangding et al. Digital audio watermarking algorithm based on linear predictive coding in wavelet domain
Thanki Advanced techniques for audio watermarking
CN105895109A (en) Digital voice evidence collection and tamper recovery method based on DWT (Discrete Wavelet Transform) and DCT (Discrete Cosine Transform)
Li et al. Adaptive audio watermarking algorithm based on SNR in wavelet domain
CN108877816B (en) QMDCT coefficient-based AAC audio frequency recompression detection method
CN114171057A (en) Transformer event detection method and system based on voiceprint
Chetan et al. Audio watermarking using modified least significant bit technique
Wu et al. Adaptive audio watermarking based on SNR in localized regions
Hubballi et al. Novel DCT based watermarking scheme for digital images
Li et al. Spread-spectrum audio watermark robust against pitch-scale modification
Jalil et al. An Efficient Tamper Detection and Recovery Scheme for Attacked Speech Signal
Yang et al. An audio watermarking based on discrete cosine transform and complex cepstrum transform
CN114548221B (en) Method and system for enhancing generated data of small sample unbalanced voice database
Lan et al. A digital watermarking algorithm based on dual-tree complex wavelet transform
Deng Blind watermarking algorithm based on redistributed invariant integer wavelet transform and BP network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant