CN115831131A - Deep learning-based audio watermark embedding and extracting method - Google Patents

Deep learning-based audio watermark embedding and extracting method

Info

Publication number
CN115831131A
CN115831131A (application CN202310056386.7A, CN 115831131 A)
Authority
CN
China
Prior art keywords
audio
watermark
decoder
embedding
encoder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310056386.7A
Other languages
Chinese (zh)
Other versions
CN115831131B (en)
Inventor
张卫明
刘畅
张�杰
方涵
马泽华
陈可江
俞能海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN202310056386.7A priority Critical patent/CN115831131B/en
Publication of CN115831131A publication Critical patent/CN115831131A/en
Application granted granted Critical
Publication of CN115831131B publication Critical patent/CN115831131B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention discloses a deep-learning-based audio watermark embedding and extracting method. First, an encoder embeds watermark information into carrier audio to obtain watermarked audio. Before the watermarked audio is input to a decoder, a distortion layer is inserted between the encoder and the decoder to enhance robustness against the audio re-recording process. The distorted watermarked audio is then input to the decoder, which extracts the watermark information. After the watermark is embedded in target audio, the method can still extract the watermark information even after the audio undergoes distortions such as noise addition, filtering, compression, resampling, re-quantization, and re-recording, thereby achieving audio leakage tracing and copyright protection.

Description

Deep learning-based audio watermark embedding and extracting method
Technical Field
The invention relates to the technical field of digital watermarks, in particular to an audio watermark embedding and extracting method based on deep learning.
Background
Digital watermarking has been studied for many years as an effective tool for leakage-source tracing and copyright protection. The two most important properties an audio watermark should satisfy are fidelity and robustness: fidelity ensures normal use of the watermarked audio, while robustness ensures that the embedded watermark can be extracted intact even after the audio undergoes distortion (MPEG encoding, noise addition, audio re-recording, and so on). Most conventional audio watermarking methods focus on robustness against digital distortions in the electronic channel, since most audio copying occurs there. However, with the miniaturization of recording devices, audio re-recording (AR) has become a more convenient and effective way to copy audio. For much important confidential audio (litigation recordings, forensic audio) and paid audio subject to piracy (online-course audio, pirated movie soundtracks), re-recording effectively preserves the audio content while significantly destroying the embedded watermark signal. By re-recording, an attacker can easily and covertly steal audio content without leaving evidence; fig. 1 is a schematic diagram of a re-recording operation on leaked information in the prior art. Maintaining robustness in such a complex scene is one of the greatest challenges for audio watermarking, and ensuring robustness against re-recording has become a critical current task.
At present, audio watermarking research still mainly relies on traditional mathematical algorithms that try to find features invariant before and after distortion in which to embed the watermark. Most of these features lie in a transform domain, obtained with audio frequency-domain transforms such as the discrete cosine transform (DCT), discrete wavelet transform (DWT), or fast Fourier transform (FFT). However, because the re-recording process itself is complex, quantitatively and finely analyzing its distortions to find robust, invariant features is very difficult, and none of the prior-art algorithms resists re-recording distortion well.
Disclosure of Invention
The invention aims to provide a deep-learning-based audio watermark embedding and extracting method that can extract the watermark information from target audio even after the watermarked audio has undergone distortions such as noise addition, filtering, compression, resampling, re-quantization, and re-recording, thereby achieving audio leakage tracing and copyright protection.
The purpose of the invention is realized by the following technical scheme:
a method of deep learning based audio watermark embedding extraction, the method comprising:
step 1, embedding watermark information into carrier audio by using an encoder to obtain audio containing a watermark;
step 2, before the audio containing the watermark is input into a decoder, a distortion layer is inserted between the encoder and the decoder for enhancing the robustness of the audio copying process;
and 3, inputting the audio containing the watermark after the distortion layer into a decoder, and extracting the watermark information by the decoder.
According to the technical scheme provided by the invention, after the watermark is embedded into the target audio, the watermark information in the target audio can still be extracted after the target audio is subjected to distortion such as noise addition, filtering, compression, resampling, requantization, dubbing and the like, so that the aims of audio leakage tracing and copyright protection are fulfilled.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. The drawings in the following description are only some embodiments of the present invention; other drawings can be obtained from them by those skilled in the art without creative effort.
Fig. 1 is a schematic diagram of a re-recording operation on leaked information in the prior art.
Fig. 2 is a flowchart of a method for embedding and extracting an audio watermark based on deep learning according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention, and they do not limit the present invention. All other embodiments that can be derived by a person skilled in the art from these embodiments without creative effort fall within the protection scope of the present invention.
Embodiments of the present invention are described in further detail below with reference to the accompanying drawings. Details not described in the embodiments belong to the prior art known to persons skilled in the art.
Fig. 2 is a schematic flowchart of a deep-learning-based audio watermark embedding and extracting method provided by an embodiment of the present invention, where the method includes:
step 1, embedding watermark information into carrier audio by using an encoder to obtain watermarked audio;
in this step, in step 1, with
Figure SMS_1
To express a length of
Figure SMS_2
The mono original carrier audio; first by differentiable Discrete wavelet transform (Discrete)Wavelet Transform, DWT) to convert the original carrier audio
Figure SMS_3
Transferring to frequency domain to obtain corresponding approximate coefficient
Figure SMS_4
And detail coefficient
Figure SMS_5
Namely:
Figure SMS_6
wherein the approximation coefficient
Figure SMS_7
And detail coefficient
Figure SMS_8
Is the original carrier audio
Figure SMS_9
Is one half, i.e.
Figure SMS_10
Inspired by traditional audio watermarking, watermarking information
Figure SMS_11
Embedding into original carrier audio
Figure SMS_12
In low frequencies, i.e. using approximation coefficients
Figure SMS_13
As carriers of watermark information while preserving detail coefficients
Figure SMS_14
For subsequent audio reconstruction;
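The decomposition above can be sketched with a single-level Haar DWT. The patent only specifies a differentiable DWT; the Haar basis, the 16 kHz length, and all variable names below are illustrative assumptions, not the patent's implementation:

```python
import numpy as np

def haar_dwt(x):
    """Single-level Haar DWT: split x into approximation (low-frequency)
    and detail (high-frequency) coefficients, each half the length of x."""
    ca = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # approximation coefficients
    cd = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # detail coefficients
    return ca, cd

def haar_idwt(ca, cd):
    """Inverse of haar_dwt: recover the even/odd samples and interleave."""
    x = np.empty(2 * len(ca))
    x[0::2] = (ca + cd) / np.sqrt(2.0)
    x[1::2] = (ca - cd) / np.sqrt(2.0)
    return x

rng = np.random.default_rng(0)
x_o = rng.standard_normal(16000)   # 1 s of mono "audio" at 16 kHz
ca, cd = haar_dwt(x_o)             # each of length N/2 = 8000
x_rec = haar_idwt(ca, cd)          # perfect reconstruction
```

Because the transform is invertible, cd can be held aside during embedding and reused later to reconstruct the time-domain audio.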
the encoder is used for encoding watermark information
Figure SMS_15
Is embedded into
Figure SMS_16
As shown in fig. 2, the encoder En generates a residual R and marks it further to
Figure SMS_17
Thereby generating approximate coefficients containing the watermark
Figure SMS_18
Namely:
Figure SMS_19
wherein
Figure SMS_20
Is an intensity factor, set to 1 by default; en (.) denotes encoder processing.
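A minimal sketch of the additive residual embedding follows. The real encoder En is a trained neural network; the bit-spreading stand-in, the residual amplitude 0.01, and the message length 100 are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
ca = rng.standard_normal(8000)         # approximation coefficients of the carrier
w = rng.choice([-1.0, 1.0], size=100)  # binary watermark message (assumed length)

def toy_encoder(ca, w):
    """Stand-in for the learned encoder En(ca, w): spread each watermark
    bit over one segment of ca as a small constant residual. Only the
    additive-residual structure matches the patent; the mapping itself
    would be produced by a trained network."""
    seg = len(ca) // len(w)
    r = np.zeros_like(ca)
    for i, bit in enumerate(w):
        r[i * seg:(i + 1) * seg] = 0.01 * bit
    return r

alpha = 1.0                 # intensity factor, default 1 as in the text
R = toy_encoder(ca, w)
ca_w = ca + alpha * R       # watermarked approximation coefficients
```

Scaling alpha trades fidelity (small residual) against robustness (large residual).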
In addition, to meet the fidelity requirement, the watermarked approximation coefficients ca_w should stay as close as possible to the original ca. A base loss L_e is therefore introduced in encoder training, using the mean square error (MSE) between ca and ca_w as L_e, namely:

L_e = MSE(ca, ca_w) = (1/(N/2)) · Σ_i (ca_i − ca_w,i)²

where i is an index, ca_i is the i-th approximation coefficient, and ca_w,i is the i-th watermarked approximation coefficient.
To further improve fidelity and minimize the domain gap between ca and ca_w, an additional discriminator D is introduced to form adversarial training with the encoder. The adversarial loss L_a drives the encoder to embed the watermark information so well that the discriminator cannot distinguish ca from ca_w, thereby minimizing the domain gap between them, namely:

L_a = log(1 − D(ca_w))

where D(.) denotes discriminator processing.
Step 2, before the watermarked audio is input to the decoder, a distortion layer is inserted between the encoder and the decoder to enhance robustness against the audio re-recording process;
in this step, it is essential to make the distortion layer differentiable, which can prevent the gradient interruption in the end-to-end learning process, however, the dubbing process is a complicated non-differential process, so the inserted distortion layer is set to the differential audio re-recording operation DAR, including ambient reverberation, band-pass filtering and gaussian noise; in a specific implementation, in order to realize robustness to dubbing, a dubbing process is analyzed from the influence of sound propagation in the air and the processing of a microphone and a loudspeaker; based on the analysis, the present example elaborately models transcription distortion through several differential operations (ambient reverberation, band-pass filtering and gaussian noise) and uses these as distortion layers with the proposed framework;
Since DAR is a process that runs in the time domain, it cannot be applied directly to the watermarked approximation coefficients ca_w. Therefore, the inverse DWT (IDWT) transforms the watermarked approximation coefficients ca_w and the corresponding detail coefficients cd back into watermarked audio x_w, namely:

x_w = IDWT(ca_w, cd)
The ambient reverberation is specifically as follows. An impulse response is the reaction of an environment to a brief input signal; it describes the acoustic properties of the environment, in particular its spatial reverberation behaviour, and reverberation can be reproduced by convolving audio with it. Different basic impulse responses are collected from different microphones, room environments, and loudspeakers to form a set I. Given target audio x, a basic impulse response h is randomly selected from the set I, and the target audio x is convolved (*) with it to simulate the ambient reverberation ER(.), namely:

ER(x) = x * h,  h ∈ I
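The convolution step can be sketched as follows. The toy impulse response (a direct path plus two decaying echoes) is an illustrative assumption; in the method, h would instead be drawn at random from a set of responses recorded with different microphones, rooms, and loudspeakers:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.standard_normal(16000)   # target (watermarked) audio, 1 s at 16 kHz

# Toy room impulse response: direct path at sample 0 plus two echoes.
h = np.zeros(800)
h[0] = 1.0      # direct sound
h[160] = 0.5    # echo after 10 ms
h[400] = 0.25   # echo after 25 ms

def ambient_reverb(x, h):
    """ER(x) = x * h: simulate room reverberation by convolving the audio
    with an impulse response, truncated back to the input length."""
    return np.convolve(x, h)[:len(x)]

y = ambient_reverb(x, h)
```

Each output sample is the current input plus scaled, delayed copies of earlier input, which is exactly how a room smears sound over time.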
the band-pass filtering specifically comprises:
since the frequency band of human hearing is limited, the widely used normal frequency range is 500Hz to 2000 Hz, based on which the commonly used speaker does not play audio with too high or too low frequency band, while the microphone also processes the played audio, usually by cutting off the frequency band outside the normal range to reduce noise, i.e. a basic noise removal process, so that the audio with watermark is processed in order to simulate the distortion caused by the inherent characteristics of the speaker and microphone
Figure SMS_50
Using band-pass filtering
Figure SMS_51
Operation, given target Audio
Figure SMS_52
Is carried out as follows
Figure SMS_53
Figure SMS_54
wherein
Figure SMS_55
And
Figure SMS_56
respectively representing low-pass filtering and high-pass filtering;
Figure SMS_57
and
Figure SMS_58
to represent
Figure SMS_59
And
Figure SMS_60
a corresponding threshold value;
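A minimal band-pass sketch, implemented here by zeroing FFT bins outside the pass band. The patent does not specify the filter realization; the FFT masking, the 500–2000 Hz thresholds, and the test tones are illustrative assumptions:

```python
import numpy as np

def band_pass(x, sr, lo, hi):
    """Band-pass x by zeroing spectral bins below lo Hz and above hi Hz.
    Conceptually this composes a high-pass at threshold lo with a
    low-pass at threshold hi; a deployed system might use FIR filters."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    spec[(freqs < lo) | (freqs > hi)] = 0.0
    return np.fft.irfft(spec, n=len(x))

sr = 16000
t = np.arange(sr) / sr
# One in-band tone (1 kHz) plus one out-of-band tone (100 Hz).
x = np.sin(2 * np.pi * 1000 * t) + np.sin(2 * np.pi * 100 * t)
y = band_pass(x, sr, lo=500.0, hi=2000.0)   # 100 Hz tone is removed
```

Since both tones sit on exact FFT bins here, the in-band tone passes through essentially unchanged while the out-of-band tone is removed entirely.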
the gaussian noise is specifically:
in addition to the above two components, the random noise caused by uncertain factors in the dubbing process is simulated by introducing gaussian noise, which is an additive noise and widely used in the current automatic speech recognition scheme to enhance the robustness to the random environmental noise, specifically by directly superposing the gaussian noise
Figure SMS_61
At the target audio
Figure SMS_62
To implement additive Gaussian noise operation
Figure SMS_63
Namely:
Figure SMS_64
wherein ,
Figure SMS_65
representing gaussian noise;
Figure SMS_66
represents a mean of 0 and a variance of
Figure SMS_67
A gaussian distribution of (a).
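The additive-noise operation is a one-liner; the noise level σ = 0.01 below is an illustrative assumption (the patent does not fix a value):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(16000)   # target audio

def gaussian_noise(x, sigma, rng):
    """GN(x) = x + n with n ~ N(0, sigma^2): additive noise modelling the
    unpredictable disturbances of the re-recording environment."""
    return x + rng.normal(0.0, sigma, size=x.shape)

y = gaussian_noise(x, sigma=0.01, rng=rng)
```

Because the operation is a plain addition, it is trivially differentiable, which is the property the distortion layer needs.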
And step 3, inputting the distorted watermarked audio to the decoder, which extracts the watermark information.
In this step, the distortion layer DAR processes the watermarked audio x_w as the composition of the three operations above, finally yielding the distorted watermarked audio x'_w, namely:

x'_w = DAR(x_w) = GN(BP(ER(x_w)))

The discrete wavelet transform (DWT) then yields the approximation coefficients ca' and detail coefficients cd' corresponding to x'_w, and the approximation coefficients ca' are input to a decoder De, which extracts the watermark w', namely:

(ca', cd') = DWT(x'_w),  w' = De(ca')

where De(.) denotes decoder processing.
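An end-to-end toy round trip of embed, distort, and decode is sketched below. Important caveat: the stand-in "decoder" subtracts the known original coefficients, i.e. it is non-blind, whereas the patent's decoder De is a trained network that extracts w blindly; the spreading scheme, amplitudes, and noise level are likewise illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
n, nbits, amp = 8000, 100, 0.01
ca = rng.standard_normal(n)                # carrier approximation coefficients
w = rng.choice([-1.0, 1.0], size=nbits)    # binary watermark in {-1, +1}

seg = n // nbits
R = np.repeat(w, seg) * amp                # spread each bit over one segment
ca_w = ca + R                              # embed (alpha = 1)

# Mild "channel" noise standing in for the DAR distortion layer.
received = ca_w + rng.normal(0.0, 0.002, size=n)

# Toy non-blind decoder: recover the residual, average each segment,
# and take the sign. Averaging suppresses the zero-mean noise so the
# per-bit sign survives; a trained De would do this blindly.
w_hat = np.sign((received - ca).reshape(nbits, seg).mean(axis=1))

acc = np.mean(w_hat == w)                  # bit accuracy
```

With the residual amplitude well above the averaged noise, every bit is recovered; the experiments in the patent report the analogous bit accuracy (ACC) for the learned system.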
In a specific implementation, a watermark loss L_w is further introduced, i.e. the mean square error (MSE) loss between the watermark information w and the watermark w' extracted by the decoder, namely:

L_w = MSE(w, w')

Using a binary watermark w ∈ {−1, 1}^L rather than {0, 1}^L is more advantageous for model watermark embedding and extraction. In this case, for watermarked audio, the watermark w' extracted by the decoder should be as close as possible to −1 and 1; for audio without a watermark, the extracted watermark should be distributed close to 0, which helps the MSE-based constraint work.
It is noted that the present invention is not limited to the embodiments described in detail herein; those skilled in the art may make variations without departing from its scope.
To illustrate the effects of the embodiments of the present invention, the following experiments are described in detail:
1) Fidelity test
First, the fidelity of the method described in the present application was compared with existing baseline methods. As shown in table 1, the method achieves an SNR of 25.86 dB, which is superior to the existing baselines.
Table 1 Quantitative comparison with baseline methods

Index     Proposed method  Baseline 1  Baseline 2
SNR (dB)  25.86            25.81       24.94
ACC (%)   99.18            77.09       56.0
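The SNR figure above is the standard signal-to-noise ratio of the watermarked audio against the original. A minimal sketch of its computation follows; the residual level 0.01 and the variable names are illustrative assumptions:

```python
import numpy as np

def snr_db(x, x_w):
    """SNR in dB of watermarked audio x_w against original x:
    10 * log10(sum(x^2) / sum((x - x_w)^2))."""
    return 10.0 * np.log10(np.sum(x ** 2) / np.sum((x - x_w) ** 2))

rng = np.random.default_rng(5)
x = rng.standard_normal(16000)
x_w = x + 0.01 * rng.standard_normal(16000)  # hypothetical embedding residual
val = snr_db(x, x_w)                         # close to 40 dB for this residual
```

Higher SNR means the embedding residual is smaller relative to the carrier, i.e. better fidelity.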
2) Robustness testing for audio dubbing
This experiment compared robustness against audio re-recording; quantitative results are provided in table 2. At comparable fidelity, the method is significantly better than the baseline methods (by over 20% and 40%, respectively). Beyond the default recording distance (5 cm), further controlled comparisons with the baseline were made under different conditions. As shown in table 2, the method of the embodiment of the present application performs better over a wide range of distances; as the distance increases, robustness to re-recording decreases accordingly but remains acceptable (above 90% in all cases).
Table 2 Re-recording robustness (ACC, %) at different distances

Distance (cm)    5      20     50     100
Proposed method  99.18  98.55  93.40  92.68
Baseline 1       77.09  82.64  74.76  66.02
3) Robustness testing for other common distortions
To compare robustness more fully, further evaluations were made under other common distortions in digital transmission: Gaussian noise at different signal-to-noise ratios (20 dB, 30 dB, 40 dB, 50 dB), MP3 compression (64 kbps, 128 kbps), band-pass filtering (1 kHz high-pass, 4 kHz low-pass), resampling, clipping, amplitude modification, re-quantization, and median filtering. As shown in table 3, the method employed in the present application is robust against all these types of distortion.
Table 3 Robustness (ACC, %) to other common distortions, default/enhanced version
[Table 3 appears only as an image in the source; its values are not recoverable here.]
The above experimental results show that the method of the embodiment of the invention can automatically embed an audio watermark and robustly extract it under various distortions, achieving higher extraction accuracy than prior methods.
In summary, after watermark information is embedded in audio, the method of the embodiment of the present invention can robustly extract the watermark under common audio-processing distortions, watermark attacks, and audio re-recording (AR) distortion, thereby achieving leakage tracing and copyright protection.
In addition, it is understood by those skilled in the art that all or part of the steps in the method for implementing the above embodiments may be implemented by using a program to instruct the relevant hardware to implement, and the corresponding program may be stored in a computer-readable storage medium, where the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are included in the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims. The information disclosed in this background section is only for enhancement of understanding of the general background of the invention and should not be taken as an acknowledgement or any form of suggestion that this information forms the prior art already known to a person skilled in the art.

Claims (6)

1. A deep-learning-based audio watermark embedding and extracting method, characterized in that the method comprises:
step 1, embedding watermark information into carrier audio by using an encoder to obtain watermarked audio;
step 2, before the watermarked audio is input to a decoder, inserting a distortion layer between the encoder and the decoder to enhance robustness against the audio re-recording process;
and step 3, inputting the distorted watermarked audio to the decoder, which extracts the watermark information.
2. The deep-learning-based audio watermark embedding and extracting method of claim 1, characterized in that in step 1, x_o denotes a mono original carrier audio of length N; first, a differentiable discrete wavelet transform (DWT) transfers the original carrier audio x_o to the frequency domain to obtain the corresponding approximation coefficients ca and detail coefficients cd, namely:

(ca, cd) = DWT(x_o)

wherein the approximation coefficients ca and detail coefficients cd each have half the length of the original carrier audio x_o, i.e. len(ca) = len(cd) = N/2; the watermark information w is embedded into the low frequencies of the original carrier audio x_o, i.e. the approximation coefficients ca serve as the carrier of the watermark information while the detail coefficients cd are preserved for subsequent audio reconstruction;

the encoder embeds the watermark information w into ca, specifically: the encoder generates a residual R and adds it to ca, thereby generating watermarked approximation coefficients ca_w, namely:

R = En(ca, w),  ca_w = ca + α·R

wherein α is an intensity factor, set to 1 by default, and En(.) denotes encoder processing.
3. The deep-learning-based audio watermark embedding and extracting method of claim 2, characterized in that in step 1, to keep the watermarked approximation coefficients ca_w as close as possible to the original ca, a base loss L_e is introduced in encoder training, using the mean square error between ca and ca_w as L_e, namely:

L_e = MSE(ca, ca_w) = (1/(N/2)) · Σ_i (ca_i − ca_w,i)²

wherein i is an index, ca_i is the i-th approximation coefficient, and ca_w,i is the i-th watermarked approximation coefficient;

to minimize the domain gap between ca and ca_w, an additional discriminator D is introduced to form adversarial training with the encoder; the adversarial loss L_a drives the encoder to embed the watermark information so that the discriminator cannot distinguish ca from ca_w, thereby minimizing the domain gap between them, namely:

L_a = log(1 − D(ca_w))

wherein D(.) denotes discriminator processing.
4. The deep-learning-based audio watermark embedding and extracting method of claim 3, characterized in that in step 2, the inserted distortion layer is a differentiable audio re-recording operation DAR comprising ambient reverberation, band-pass filtering, and Gaussian noise;

since DAR is a process running in the time domain, it cannot be applied directly to the watermarked approximation coefficients ca_w; therefore, the inverse DWT, i.e. IDWT, transforms the watermarked approximation coefficients ca_w and the corresponding detail coefficients cd back into watermarked audio x_w, namely:

x_w = IDWT(ca_w, cd)

wherein the ambient reverberation is specifically: reverberation in an environment is reproduced by convolution with an impulse response; different basic impulse responses are collected from different microphones, room environments, and loudspeakers to form a set I; given target audio x, a basic impulse response h is randomly selected from the set I and convolved (*) with the target audio x to simulate the ambient reverberation ER(.), namely:

ER(x) = x * h,  h ∈ I

the band-pass filtering is specifically: to simulate the distortion caused by the inherent characteristics of the loudspeaker and microphone, a band-pass filtering operation BP(.) is applied to the watermarked audio; given target audio x, it is carried out as:

BP(x) = LP(HP(x, t_h), t_l)

wherein LP and HP denote low-pass and high-pass filtering, respectively, and t_l and t_h are the thresholds corresponding to LP and HP;

the Gaussian noise is specifically: random noise caused by uncertain factors in the re-recording process is simulated by introducing Gaussian noise, specifically by directly superimposing Gaussian noise n on the target audio x to implement the additive Gaussian noise operation GN(.), namely:

GN(x) = x + n,  n ~ N(0, σ²)

wherein n denotes the Gaussian noise and N(0, σ²) is a Gaussian distribution with mean 0 and variance σ².
5. The deep-learning-based audio watermark embedding and extracting method of claim 4, characterized in that in step 3, the distortion layer DAR processes the watermarked audio x_w as the composition of the three operations above, finally yielding the distorted watermarked audio x'_w, namely:

x'_w = DAR(x_w) = GN(BP(ER(x_w)))

the approximation coefficients ca' and detail coefficients cd' corresponding to x'_w are obtained by the discrete wavelet transform DWT, and the approximation coefficients ca' are input to a decoder De, which extracts the watermark w', namely:

(ca', cd') = DWT(x'_w),  w' = De(ca')

wherein De(.) denotes decoder processing.
6. The deep-learning-based audio watermark embedding and extracting method of claim 5, characterized in that in step 3, a watermark loss L_w is further introduced, i.e. the mean square error loss MSE between the watermark information w and the watermark w' extracted by the decoder, namely:

L_w = MSE(w, w')

when a binary watermark w ∈ {−1, 1}^L is used rather than {0, 1}^L, in this case, for watermarked audio, the watermark w' extracted by the decoder should be as close as possible to −1 and 1; for audio without a watermark, the watermark extracted by the decoder should be close to 0.
CN202310056386.7A 2023-01-15 2023-01-15 Audio watermark embedding and extracting method based on deep learning Active CN115831131B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310056386.7A CN115831131B (en) 2023-01-15 2023-01-15 Audio watermark embedding and extracting method based on deep learning


Publications (2)

Publication Number Publication Date
CN115831131A true CN115831131A (en) 2023-03-21
CN115831131B CN115831131B (en) 2023-06-16

Family

ID=85520711

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310056386.7A Active CN115831131B (en) 2023-01-15 2023-01-15 Audio watermark embedding and extracting method based on deep learning

Country Status (1)

Country Link
CN (1) CN115831131B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19852805A1 (en) * 1998-11-15 2000-05-18 Florian M Koenig Telephone with improved speech understanding, several microphones, and special speech signal processing
WO2001054053A1 (en) * 2000-01-24 2001-07-26 Ecole Polytechnique Federale De Lausanne Transform domain allocation for multimedia watermarking
US20080027734A1 (en) * 2006-07-26 2008-01-31 Nec (China) Co. Ltd. Media program identification method and apparatus based on audio watermarking
CN101290772A (en) * 2008-03-27 2008-10-22 上海交通大学 Embedding and extracting method for audio zero water mark based on vector quantization of coefficient of mixed domain
CN102074238A (en) * 2010-12-13 2011-05-25 山东科技大学 Linear interference cancellation-based speech secrete communication method
CN102074237A (en) * 2010-11-30 2011-05-25 辽宁师范大学 Digital audio watermarking method based on invariant characteristic of histogram
JP2011197619A (en) * 2010-03-18 2011-10-06 Yasuo Sano Perception improving means for acoustic signal using electronic watermark
CN103221997A (en) * 2010-09-21 2013-07-24 弗兰霍菲尔运输应用研究公司 Watermark generator, watermark decoder, method for providing a watermarked signal based on discrete valued data and method for providing discrete valued data in dependence on a watermarked signal
CN108962267A (en) * 2018-07-09 2018-12-07 成都信息工程大学 A kind of encryption voice content authentication method based on Hash feature
CN111292756A (en) * 2020-01-19 2020-06-16 成都嗨翻屋科技有限公司 Compression-resistant audio silent watermark embedding and extracting method and system
CN113808557A (en) * 2020-06-12 2021-12-17 比亚迪股份有限公司 Vehicle-mounted audio processing system, method and device


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Xu Jiajia; Zhang Weiming; Jiang Ruiqi; Yu Nenghai; Hu Xiaocheng: "Research on reversible data hiding under the optimal structural similarity constraint" (最优结构相似约束下的可逆信息隐藏算法研究), Chinese Journal of Network and Information Security (网络与信息安全学报)

Also Published As

Publication number Publication date
CN115831131B (en) 2023-06-16

Similar Documents

Publication Publication Date Title
Hua et al. Twenty years of digital audio watermarking—a comprehensive review
JP3856652B2 (en) Hidden data embedding method and apparatus
Arnold et al. A phase-based audio watermarking system robust to acoustic path propagation
GB2431839A (en) Correlation in audio processing
Dhar et al. Advances in audio watermarking based on singular value decomposition
US20180144755A1 (en) Method and apparatus for inserting watermark to audio signal and detecting watermark from audio signal
Dhar et al. Digital watermarking scheme based on fast Fourier transformation for audio copyright protection
Xiang et al. Digital audio watermarking: fundamentals, techniques and challenges
Liu et al. Dear: A deep-learning-based audio re-recording resilient watermarking
US20080273707A1 (en) Audio Processing
Malik et al. Robust audio watermarking using frequency-selective spread spectrum
Hemis et al. Digital watermarking in audio for copyright protection
CN115831131B (en) Audio watermark embedding and extracting method based on deep learning
US20020184503A1 (en) Watermarking
Subhashini et al. Robust audio watermarking for monitoring and information embedding
Arnold et al. Quality evaluation of watermarked audio tracks
Lee et al. Audio watermarking through modification of tonal maskers
JP2006171110A (en) Method for embedding additional information to audio data, method for reading embedded additional information from audio data, and apparatus therefor
Shahriar et al. Time-domain audio watermarking using multiple marking spaces
Deshpande et al. A substitution-by-interpolation algorithm for watermarking audio
Li et al. A novel audio watermarking in wavelet domain
Patil et al. Audio watermarking: A way to copyright protection
Chore et al. Survey on different methods of digital audio watermarking
Cvejic et al. Audio watermarking: Requirements, algorithms, and benchmarking
Shokri et al. Audio-speech watermarking using a channel equalizer

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant