CN115831131B - Audio watermark embedding and extracting method based on deep learning - Google Patents

Audio watermark embedding and extracting method based on deep learning

Info

Publication number
CN115831131B
CN115831131B (application CN202310056386.7A)
Authority
CN
China
Prior art keywords
audio
watermark
decoder
encoder
steps
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310056386.7A
Other languages
Chinese (zh)
Other versions
CN115831131A (en)
Inventor
张卫明
刘畅
张杰
方涵
马泽华
陈可江
俞能海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Science and Technology of China USTC
Original Assignee
University of Science and Technology of China USTC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Science and Technology of China USTC filed Critical University of Science and Technology of China USTC
Priority to CN202310056386.7A priority Critical patent/CN115831131B/en
Publication of CN115831131A publication Critical patent/CN115831131A/en
Application granted granted Critical
Publication of CN115831131B publication Critical patent/CN115831131B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention discloses a deep-learning-based audio watermark embedding and extraction method. The method first embeds watermark information into carrier audio using an encoder to obtain watermarked audio; before the watermarked audio is input to the decoder, a distortion layer is inserted between the encoder and the decoder to enhance robustness to the audio re-recording process; the watermarked audio, after passing through the distortion layer, is input to the decoder, which extracts the watermark information. With this method, after the watermark is embedded into the target audio, the watermark information can still be extracted even after the audio undergoes distortions such as noise addition, filtering, compression, resampling, re-quantization, and re-recording, thereby achieving the purposes of audio leakage tracing and copyright protection.

Description

Audio watermark embedding and extracting method based on deep learning
Technical Field
The invention relates to the technical field of digital watermarking, in particular to a method for embedding and extracting an audio watermark based on deep learning.
Background
Digital watermarking has been widely studied for many years as an effective means of tracing leakage sources and protecting copyright. The two most important properties an audio watermark should satisfy are fidelity and robustness: fidelity ensures normal use of the watermarked audio, while robustness ensures that the embedded watermark can be extracted without loss even if the audio undergoes distortion (MPEG encoding, noise addition, audio re-recording, etc.). Most conventional audio watermarking methods focus on robustness against digital distortions in electronic channels, since most audio copying occurs in digital channels. However, with the miniaturization of recording devices, Audio Re-recording (AR) has become a more convenient and efficient way to copy audio. For much important confidential audio (litigation recordings, forensic audio) and pirated paid audio (online-classroom audio, pirated films), re-recording effectively preserves the audio content while significantly destroying the embedded watermark signal, so an attacker can easily and covertly steal audio content by re-recording it, leaving little evidence; fig. 1 is a schematic diagram of a re-recording attack on leaked information in the prior art. Maintaining robustness in such complex scenes is therefore one of the greatest challenges for audio watermarking, and ensuring robustness against re-recording has become a pressing requirement for audio watermarking at the current stage.
At present, research on audio watermarking is mainly based on traditional mathematical algorithms that attempt to find features invariant before and after distortion in which to embed the watermark. Most of the features used lie in a transform domain; for example, the transform-domain features of audio are obtained with frequency-domain transforms such as the Discrete Cosine Transform (DCT), the Discrete Wavelet Transform (DWT), and the Fast Fourier Transform (FFT). However, because the re-recording process is itself complex, quantitatively and finely analyzing its distortion and finding robust, invariant features is very difficult, and none of the prior-art algorithms resists re-recording distortion well.
Disclosure of Invention
The invention aims to provide a deep-learning-based audio watermark embedding and extraction method, with which watermark information can still be extracted from target audio after the watermark is embedded and the audio undergoes distortions such as noise addition, filtering, compression, resampling, re-quantization, and re-recording, thereby achieving the purposes of audio leakage tracing and copyright protection.
The aim of the invention is achieved through the following technical scheme:
A deep-learning-based audio watermark embedding and extraction method, the method comprising:
step 1, embedding watermark information into carrier audio by using an encoder to obtain watermarked audio;
step 2, before the watermarked audio is input to the decoder, inserting a distortion layer between the encoder and the decoder to enhance robustness to the audio re-recording process;
and step 3, inputting the watermarked audio, after it has passed through the distortion layer, into the decoder, and extracting the watermark information therein by the decoder.
According to the technical scheme provided by the invention, after the watermark is embedded into the target audio, the watermark information can still be extracted even after the audio undergoes distortions such as noise addition, filtering, compression, resampling, re-quantization, and re-recording, thereby achieving the purposes of audio leakage tracing and copyright protection.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic diagram of a re-recording attack on leaked information in the prior art.
Fig. 2 is a schematic flow chart of a method for audio watermark embedding and extraction based on deep learning according to an embodiment of the present invention.
Detailed Description
The technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments of the present invention, and this is not limiting to the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
Embodiments of the present invention will be described in further detail below with reference to the accompanying drawings. What is not described in detail in the embodiments of the present invention belongs to the prior art known to those skilled in the art. Where specific conditions are not noted in the examples of the present invention, they are carried out under conditions conventional in the art or suggested by the manufacturer. Reagents or apparatus used without a noted manufacturer are conventional products available commercially.
Fig. 2 is a schematic flow chart of a method for audio watermark embedding and extraction based on deep learning according to an embodiment of the present invention, where the method includes:
step 1, embedding watermark information into carrier audio by using an encoder to obtain watermarked audio;
in this step, in step 1, use is made of
Figure SMS_1
To represent a length of +.>
Figure SMS_2
Mono original carrier audio of (a); the original carrier audio is first of all +.>
Figure SMS_3
Transferring to frequency domain to obtain corresponding approximation coefficients +.>
Figure SMS_4
And detail coefficient->
Figure SMS_5
The method comprises the following steps:
Figure SMS_6
wherein the approximation coefficients
Figure SMS_7
And detail coefficient->
Figure SMS_8
Is the length of the original carrier audio +.>
Figure SMS_9
Half of (i.e.)>
Figure SMS_10
Inspired by the traditional audio watermarking, watermark information is obtained
Figure SMS_11
Embedded in the original carrier audio->
Figure SMS_12
In the low frequency of (2), i.e. using approximation coefficients +.>
Figure SMS_13
As carrier of watermark information while preserving detail coefficients +.>
Figure SMS_14
For subsequent audio reconstruction;
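The decomposition above can be illustrated with the PyWavelets library; the following is a minimal sketch, not part of the patent text, and the wavelet basis is not specified in this excerpt, so the Haar wavelet used here is an assumption:

```python
import numpy as np
import pywt

# Original mono carrier audio x_o of length N (random signal as a stand-in).
N = 16000
x_o = np.random.randn(N)

# One-level DWT: approximation coefficients a (low frequency) carry the
# watermark; detail coefficients d are kept for reconstruction.
a, d = pywt.dwt(x_o, 'haar')   # 'haar' is an assumption
assert len(a) == N // 2 and len(d) == N // 2

# Perfect reconstruction via the inverse DWT (IDWT).
x_rec = pywt.idwt(a, d, 'haar')
print(np.allclose(x_o, x_rec))  # True up to numerical precision
```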
the encoder is used for watermark information
Figure SMS_15
Embedded in->
Figure SMS_16
In, as shown in fig. 2, the encoder En generates a residual R and marks it further to +.>
Figure SMS_17
Thereby generating approximation coefficients +.>
Figure SMS_18
The method comprises the following steps:
Figure SMS_19
wherein
Figure SMS_20
Is an intensity factor, default set to 1; en () represents encoder processing.
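A minimal sketch of this residual-style embedding follows; the encoder architecture is not disclosed in this excerpt, so the small 1-D convolutional network, the watermark tiling, and all dimensions are illustrative assumptions:

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Residual-generating encoder En(a, m); the real architecture is not
    detailed in this excerpt, a small 1-D conv stack is an assumption."""
    def __init__(self, wm_len: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(2, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(16, 1, kernel_size=5, padding=2),
        )
        self.wm_len = wm_len

    def forward(self, a: torch.Tensor, m: torch.Tensor) -> torch.Tensor:
        # a: (B, L) approximation coefficients; m: (B, wm_len) watermark bits.
        # Tile the watermark along the time axis so it can be stacked with
        # the coefficients as a second channel.
        L = a.shape[-1]
        m_tiled = m.repeat(1, L // self.wm_len + 1)[:, :L]
        inp = torch.stack([a, m_tiled], dim=1)        # (B, 2, L)
        return self.net(inp).squeeze(1)               # residual R: (B, L)

B, L, wm_len = 4, 8000, 100
a = torch.randn(B, L)
m = torch.randint(0, 2, (B, wm_len)).float() * 2 - 1  # binary watermark in {-1, 1}
en = Encoder(wm_len)
lam = 1.0                                             # intensity factor, default 1
a_w = a + lam * en(a, m)                              # watermarked coefficients
```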
In addition, to meet the fidelity requirement, the watermarked approximation coefficients $a_w$ should stay as consistent with $a$ as possible. A basic loss $\mathcal{L}_e$ is therefore introduced in the training of the encoder, using the mean square error (MSE) as $\mathcal{L}_e$:

$\mathcal{L}_e = \mathrm{MSE}(a, a_w) = \frac{1}{N/2} \sum_{i=1}^{N/2} (a_i - a_{w,i})^2$

where $i$ represents an index number, $a_i$ represents the $i$-th approximation coefficient, and $a_{w,i}$ represents the $i$-th watermarked approximation coefficient.

To further improve fidelity and minimize the domain difference between $a$ and $a_w$, an additional discriminator D is introduced to create adversarial training with the encoder. The adversarial loss $\mathcal{L}_{adv}$ drives the encoder to embed the watermark information so well that the discriminator cannot distinguish $a$ from $a_w$, thereby minimizing the domain gap between them:

$\mathcal{L}_{adv} = -\log\big(1 - D(a_w)\big)$

where $D(\cdot)$ represents the discriminator processing.
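The two training losses can be sketched as follows; the discriminator architecture and the exact GAN formulation are assumptions, chosen to be consistent with the loss written above:

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """D(): predicted probability that the input coefficients carry a
    watermark; the concrete architecture is an illustrative assumption."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 8, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
            nn.Linear(8, 1), nn.Sigmoid(),
        )

    def forward(self, a: torch.Tensor) -> torch.Tensor:
        return self.net(a.unsqueeze(1)).squeeze(1)  # (B, L) -> (B,)

disc = Discriminator()
mse = nn.MSELoss()

def encoder_losses(a: torch.Tensor, a_w: torch.Tensor, eps: float = 1e-6):
    l_e = mse(a_w, a)                               # basic fidelity loss L_e
    # Adversarial loss for the encoder: pushing D(a_w) towards 0
    # ("not watermarked") makes a_w indistinguishable from a.
    l_adv = -torch.log(1.0 - disc(a_w) + eps).mean()
    return l_e, l_adv

a = torch.randn(4, 8000)
a_w = a + 0.01 * torch.randn(4, 8000)               # stand-in for encoder output
l_e, l_adv = encoder_losses(a, a_w)
```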
Step 2, before the watermarked audio is input to the decoder, inserting a distortion layer between the encoder and the decoder to enhance robustness to the audio re-recording process;
in this step, it is necessary to make the distortion layer tiny, which can prevent gradient interruption in the end-to-end learning process, whereas the transcription process is a complex non-differential process, so the inserted distortion layer is set as a differential audio re-recording operation DAR, including ambient reverberation, band-pass filtering, and gaussian noise; in a specific implementation, in order to realize robustness to the dubbing, the dubbing process is analyzed from the influence of sound propagation in the air and the processing of a microphone and a loudspeaker; from the analysis, this example models the transcription distortion finely by several differential operations (ambient reverberation, band-pass filtering, and gaussian noise) and uses these operations as a distortion layer with the proposed framework;
since DAR is a process running in the time domain, it cannot be directly applied to approximation coefficients containing watermarks
Figure SMS_37
The approximation coefficients containing the watermark are therefore scaled using inverse DWT, IDWT (Inverse Discrete Wavelet Transform)
Figure SMS_38
And the corresponding detail coefficients->
Figure SMS_39
Transform back to watermarked audio->
Figure SMS_40
The method comprises the following steps:
Figure SMS_41
the environmental reverberation specifically includes:
the impulse response is the response of the environment upon receipt of a brief input signal describing the acoustic properties of the environment, in particular the spatial reverberation behavior, the impulse response reproducing the reverberation in the environment by convolution, collecting different base impulse responses from different microphones, room environments and loudspeakers to form a set
Figure SMS_42
Given target audio +.>
Figure SMS_43
From the collection->
Figure SMS_44
A basic impulse response is selected randomly>
Figure SMS_45
And by means of the collection->
Figure SMS_46
Audio of the object->
Figure SMS_47
Convolving->
Figure SMS_48
Operate to simulate ambient reverberation ER (), i.e.:
Figure SMS_49
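A sketch of this operation using SciPy's FFT-based convolution follows; the synthetic decaying impulse response stands in for a measured one (real sets are recorded with actual rooms, speakers, and microphones) and is purely illustrative:

```python
import numpy as np
from scipy.signal import fftconvolve

def ambient_reverb(x_t: np.ndarray, ir_set: list) -> np.ndarray:
    """ER(): convolve the target audio with a randomly chosen basic impulse
    response from the collected set, simulating room reverberation."""
    h = ir_set[np.random.randint(len(ir_set))]
    return fftconvolve(x_t, h, mode='full')[: len(x_t)]

# A synthetic exponentially decaying impulse response as a stand-in.
fs = 16000
t = np.arange(int(0.3 * fs)) / fs
h_demo = np.exp(-8.0 * t) * np.random.randn(len(t)) * 0.1
h_demo[0] = 1.0  # direct sound path
x_reverb = ambient_reverb(np.random.randn(fs), [h_demo])
```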
the band-pass filtering specifically comprises the following steps:
because of the limited frequency band of human hearing, the widely used normal frequency range is 500Hz to 2000 Hz, based on which the conventional speaker does not play audio in too high or too low a frequency band, while the microphone also processes the played audio, typically cuts off the frequency band outside the normal range, to reduce noise, i.e. a basic de-noising process, thus, to simulate distortion caused by the inherent characteristics of the speaker and microphone, the audio containing the watermark
Figure SMS_50
Applying frequency band-pass filtering
Figure SMS_51
Operation, given target Audio->
Figure SMS_52
Execute +.>
Figure SMS_53
Figure SMS_54
wherein
Figure SMS_55
and />
Figure SMS_56
Representing low-pass filtering and high-pass filtering, respectively; />
Figure SMS_57
and />
Figure SMS_58
Representation->
Figure SMS_59
and />
Figure SMS_60
A corresponding threshold;
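A sketch with a Butterworth band-pass filter follows; the filter family and order are assumptions, since the text only specifies the high-pass/low-pass composition and the cutoff thresholds:

```python
import numpy as np
from scipy.signal import butter, sosfilt

def band_pass(x_t: np.ndarray, f_low: float, f_high: float, fs: float) -> np.ndarray:
    """BP(): pass only [f_low, f_high], emulating the limited pass-band of
    speakers and microphones. Butterworth of order 4 is an assumption."""
    sos = butter(4, [f_low, f_high], btype='bandpass', fs=fs, output='sos')
    return sosfilt(sos, x_t)

fs = 16000
x_t = np.random.randn(fs)
# 500 Hz - 2000 Hz is the "normal" range quoted in the description above.
x_bp = band_pass(x_t, 500.0, 2000.0, fs)
```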
the Gaussian noise is specifically:
in addition to the two components, random noise caused by uncertainty factors in the reproduction process is simulated by introducing Gaussian noise, which is an additive noise widely used in current automatic speech recognition schemes to enhance robustness to random environmental noise, in particular by directly superimposing Gaussian noise
Figure SMS_61
In the target audio +.>
Figure SMS_62
Upper implementation of Add Gaussian noise operation->
Figure SMS_63
The method comprises the following steps:
Figure SMS_64
wherein ,
Figure SMS_65
representing gaussian noise; />
Figure SMS_66
Mean value 0, variance ++>
Figure SMS_67
Is a gaussian distribution of (c).
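A short sketch of the operation; the noise level sigma used here is an arbitrary illustrative value:

```python
import numpy as np

def add_gaussian_noise(x_t: np.ndarray, sigma: float = 0.01) -> np.ndarray:
    """GN(): superimpose zero-mean Gaussian noise of variance sigma^2."""
    n = np.random.normal(loc=0.0, scale=sigma, size=x_t.shape)
    return x_t + n

x_noisy = add_gaussian_noise(np.random.randn(16000))
```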
And step 3, inputting the watermarked audio, after it has passed through the distortion layer, into the decoder, and extracting the watermark information therein by the decoder.
In this step, the processing procedure of the distortion layer DAR is as follows:

$\mathrm{DAR}(x) = \mathrm{GN}\big(\mathrm{BP}(\mathrm{ER}(x))\big)$

For the watermarked audio $x_w$, the watermarked audio $x'_w$ after being subjected to the distortion layer DAR is finally obtained:

$x'_w = \mathrm{DAR}(x_w)$

The discrete wavelet transform DWT is used to obtain the approximation coefficients $a'_w$ and detail coefficients $d'_w$ corresponding to $x'_w$, and the approximation coefficients $a'_w$ are input into the decoder De, which extracts the watermark $m'$:

$(a'_w, d'_w) = \mathrm{DWT}(x'_w)$

$m' = \mathrm{De}(a'_w)$

where $\mathrm{De}(\cdot)$ represents the decoder processing.
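Composing the three operations gives the full distortion-layer pass. This NumPy sketch reuses the functions and the impulse response h_demo from the previous sketches (assumed in scope); it is non-differentiable as written, whereas training would use differentiable (e.g. PyTorch) counterparts of each operation:

```python
import numpy as np
import pywt

def dar(x_w, ir_set, f_low, f_high, fs, sigma):
    """DAR(x) = GN(BP(ER(x))): the differentiable audio re-recording layer,
    illustrated here with the NumPy versions of the three operations."""
    return add_gaussian_noise(
        band_pass(ambient_reverb(x_w, ir_set), f_low, f_high, fs), sigma)

fs = 16000
x_w = np.random.randn(fs)                  # stand-in for watermarked audio
x_w_dist = dar(x_w, [h_demo], 500.0, 2000.0, fs, 0.01)

# Decoder side: DWT the distorted audio and feed the approximation
# coefficients to the trained decoder De (the decoder itself is not shown).
a_w_dist, d_w_dist = pywt.dwt(x_w_dist, 'haar')
# m_prime = decoder(torch.from_numpy(a_w_dist).float().unsqueeze(0))
```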
In a specific implementation, a watermark loss $\mathcal{L}_m$ is further introduced, i.e. the mean square error loss MSE (Mean Square Error) between the watermark information $m$ and the watermark $m'$ extracted by the decoder:

$\mathcal{L}_m = \mathrm{MSE}(m, m')$

A binary watermark taking values in $\{-1, 1\}$ rather than $\{0, 1\}$ is employed, which is more advantageous for watermark embedding and extraction by the model. In this case, for watermarked audio, the distribution of the watermark $m'$ extracted by the decoder should be as close as possible to $-1$ and $1$; for unwatermarked audio, the distribution of the extracted watermark should be close to 0, which facilitates the MSE-based constraint.
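A sketch of how the {-1, 1} convention can be used at extraction time; the sign threshold and the bit-accuracy (ACC) computation are assumptions consistent with the description above:

```python
import numpy as np

def recover_bits(m_prime: np.ndarray) -> np.ndarray:
    """Map decoder outputs (trained towards -1/+1) back to binary bits via a
    simple sign threshold; values near 0 would indicate unwatermarked audio."""
    return (np.sign(m_prime) > 0).astype(np.int8)

def bit_accuracy(m_true: np.ndarray, m_prime: np.ndarray) -> float:
    """Fraction of correctly recovered watermark bits (the ACC metric
    reported in the experiments below)."""
    return float(np.mean(recover_bits(m_prime) == (m_true > 0)))

m_true = np.random.choice([-1.0, 1.0], size=100)
m_pred = m_true + np.random.normal(0, 0.3, size=100)  # simulated decoder output
print(f"ACC = {bit_accuracy(m_true, m_pred):.2%}")
```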
It is noted that what is not described in detail in the embodiments of the present invention belongs to the prior art known to those skilled in the art.
In order to illustrate the effect of the scheme of the embodiment of the present invention, the following detailed description is made through experiments:
1) Fidelity testing
First, the fidelity of the method described herein is compared with the existing baseline methods. As shown in table 1, the method achieves an SNR of 25.86 dB, superior to both baselines.
Table 1. Quantitative comparison with baseline methods

Metric     This method   Baseline 1   Baseline 2
SNR (dB)   25.86         25.81        24.94
ACC (%)    99.18         77.09        56.0
2) Robustness test for audio re-recording
The robustness to audio re-recording was compared in this experiment, with quantitative results provided in table 2; the extraction accuracy significantly exceeds the baseline methods (by over 20% and 40%, respectively). In addition to the default distance (5 cm), a controlled comparison with the baseline method was further made under different conditions. As shown in table 2, the method of this embodiment performs better over long distances; as the distance increases, the robustness to re-recording decreases correspondingly but remains acceptable (all above 90%).
Table 2. Robustness comparison (ACC, %) for re-recording at different distances

Distance (cm)   5       20      50      100
This method     99.18   98.55   93.40   92.68
Baseline 1      77.09   82.64   74.76   66.02
3) Robustness testing against other common distortions
To compare robustness more comprehensively, further evaluations were made under other common distortions in digital transmission, namely Gaussian noise at different signal-to-noise ratios (20 dB, 30 dB, 40 dB, 50 dB), MP3 compression (64 kbps, 128 kbps), band-pass filtering (1 kHz high-pass, 4 kHz low-pass), resampling, clipping, amplitude modification, re-quantization, and median filtering. As shown in table 3, the method of the present application is robust against all of these types of distortion.
Table 3. Robustness to other common distortions (ACC, default/enhanced)

[Table 3 appears as an image in the original publication; its per-distortion values are not recoverable here.]
The experimental results show that the method of the embodiment of the invention can automatically embed the audio watermark and robustly extract it under various distortions, achieving higher extraction accuracy than existing methods.
In summary, after watermark information is embedded in audio, the method of the embodiment of the invention achieves robust extraction of the watermark under common audio-processing distortions, watermark attacks, and Audio Re-recording (AR) distortion, thereby achieving the purposes of leakage tracing and copyright protection.
In addition, it will be understood by those skilled in the art that all or part of the steps in implementing the methods of the above embodiments may be implemented by a program to instruct related hardware, and the corresponding program may be stored in a computer readable storage medium, where the storage medium may be a read only memory, a magnetic disk or an optical disk, etc.
The foregoing is only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions easily contemplated by those skilled in the art within the scope of the present invention should be included in the scope of the present invention. Therefore, the protection scope of the present invention should be subject to the protection scope of the claims. The information disclosed in the background section herein is only for enhancement of understanding of the general background of the invention and is not to be taken as an admission or any form of suggestion that this information forms the prior art already known to those of ordinary skill in the art.

Claims (3)

1. A method for deep learning-based audio watermark embedding and extraction, the method comprising:
step 1, embedding watermark information into carrier audio by using an encoder to obtain watermarked audio;
in step 1, $x_o$ is used to represent a mono original carrier audio of length $N$; the original carrier audio $x_o$ is first transferred to the frequency domain by a differentiable discrete wavelet transform DWT to obtain the corresponding approximation coefficients $a$ and detail coefficients $d$:

$(a, d) = \mathrm{DWT}(x_o)$

wherein the approximation coefficients $a$ and the detail coefficients $d$ each have a length equal to half the length $N$ of the original carrier audio, i.e. $N/2$; the watermark information $m$ is embedded in the low frequencies of the original carrier audio $x_o$, i.e. the approximation coefficients $a$ are used as the carrier of the watermark information while the detail coefficients $d$ are preserved for subsequent audio reconstruction;
the encoder is used to embed the watermark information $m$ into $a$, specifically: the encoder generates a residual $R = \mathrm{En}(a, m)$ and adds it to $a$, thereby generating the watermarked approximation coefficients $a_w$:

$a_w = a + \lambda \cdot \mathrm{En}(a, m)$

wherein $\lambda$ is an intensity factor, set to 1 by default, and $\mathrm{En}(\cdot)$ represents the encoder processing;
in order to keep the watermarked approximation coefficients $a_w$ as consistent with $a$ as possible, a basic loss $\mathcal{L}_e$ is introduced in the training of the encoder, using the mean square error as $\mathcal{L}_e$:

$\mathcal{L}_e = \mathrm{MSE}(a, a_w) = \frac{1}{N/2} \sum_{i=1}^{N/2} (a_i - a_{w,i})^2$

wherein $i$ represents an index number, $a_i$ represents the $i$-th approximation coefficient, and $a_{w,i}$ represents the $i$-th watermarked approximation coefficient;
to minimize the domain difference between $a$ and $a_w$, an additional discriminator is introduced to create adversarial training with the encoder, with an adversarial loss $\mathcal{L}_{adv}$ that drives the encoder to embed the watermark information so that the discriminator cannot distinguish $a$ from $a_w$, thereby minimizing the domain gap between them:

$\mathcal{L}_{adv} = -\log\big(1 - D(a_w)\big)$

wherein $D(\cdot)$ represents the discriminator processing;
step 2, before the watermarked audio is input to the decoder, inserting a distortion layer between the encoder and the decoder to enhance robustness to the audio re-recording process;
in step 2, the inserted distortion layer is a differentiable audio re-recording operation DAR, comprising ambient reverberation, band-pass filtering, and Gaussian noise;
since DAR is a process running in the time domain, it cannot be applied directly to the watermarked approximation coefficients $a_w$; therefore the inverse DWT, i.e. IDWT, transforms the watermarked approximation coefficients $a_w$ and the corresponding detail coefficients $d$ back into the watermarked audio $x_w$:

$x_w = \mathrm{IDWT}(a_w, d)$
the ambient reverberation is specifically: the impulse response reproduces reverberation in the environment by convolution; different basic impulse responses are collected from different microphones, room environments, and loudspeakers to form a set $\mathcal{I}$; given target audio $x_t$, a basic impulse response $h$ is randomly selected from the set $\mathcal{I}$ and convolved with the target audio $x_t$ to simulate the ambient reverberation operation $\mathrm{ER}(\cdot)$, i.e.:

$\mathrm{ER}(x_t) = x_t * h, \quad h \in \mathcal{I}$
the band-pass filtering is specifically: to simulate the distortion caused by the inherent characteristics of loudspeakers and microphones, band-pass filtering $\mathrm{BP}(\cdot)$ is applied to the watermarked audio; given target audio $x_t$, the following operation is executed:

$\mathrm{BP}(x_t) = \mathrm{LP}\big(\mathrm{HP}(x_t, f_h), f_l\big)$

wherein $\mathrm{LP}$ and $\mathrm{HP}$ represent low-pass filtering and high-pass filtering, respectively, and $f_l$ and $f_h$ represent the thresholds corresponding to $\mathrm{LP}$ and $\mathrm{HP}$;
the Gaussian noise is specifically: random noise caused by uncertainty factors during re-recording is simulated by introducing Gaussian noise, specifically by directly superimposing Gaussian noise $n$ on the target audio $x_t$ to implement the add-Gaussian-noise operation $\mathrm{GN}(\cdot)$:

$\mathrm{GN}(x_t) = x_t + n, \quad n \sim \mathcal{N}(0, \sigma^2)$

wherein $n$ represents Gaussian noise, drawn from a Gaussian distribution with mean 0 and variance $\sigma^2$;
and step 3, inputting the watermarked audio, after it has passed through the distortion layer, into the decoder, and extracting the watermark information therein by the decoder.
2. The method for deep learning-based audio watermark embedding and extraction of claim 1, wherein in step 3 the processing procedure of the distortion layer DAR is as follows:

$\mathrm{DAR}(x) = \mathrm{GN}\big(\mathrm{BP}(\mathrm{ER}(x))\big)$

for the watermarked audio $x_w$, the watermarked audio $x'_w$ after being subjected to the distortion layer DAR is finally obtained:

$x'_w = \mathrm{DAR}(x_w)$

the discrete wavelet transform DWT is used to obtain the approximation coefficients $a'_w$ and detail coefficients $d'_w$ corresponding to $x'_w$, and the approximation coefficients $a'_w$ are input into the decoder De, which extracts the watermark $m'$:

$(a'_w, d'_w) = \mathrm{DWT}(x'_w)$

$m' = \mathrm{De}(a'_w)$

wherein $\mathrm{De}(\cdot)$ represents the decoder processing.
3. The method for deep learning-based audio watermark embedding and extraction of claim 2, wherein in step 3 a watermark loss $\mathcal{L}_m$ is further introduced, i.e. the mean square error MSE between the watermark information $m$ and the watermark $m'$ extracted by the decoder, namely:

$\mathcal{L}_m = \mathrm{MSE}(m, m')$

a binary watermark taking values in $\{-1, 1\}$ rather than $\{0, 1\}$ is employed; in this case, for watermarked audio, the distribution of the watermark $m'$ extracted by the decoder should be as close as possible to $-1$ and $1$, and for unwatermarked audio, the distribution of the extracted watermark should be close to 0.
CN202310056386.7A 2023-01-15 2023-01-15 Audio watermark embedding and extracting method based on deep learning Active CN115831131B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310056386.7A CN115831131B (en) 2023-01-15 2023-01-15 Audio watermark embedding and extracting method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310056386.7A CN115831131B (en) 2023-01-15 2023-01-15 Audio watermark embedding and extracting method based on deep learning

Publications (2)

Publication Number Publication Date
CN115831131A CN115831131A (en) 2023-03-21
CN115831131B (en) 2023-06-16

Family

ID=85520711

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310056386.7A Active CN115831131B (en) 2023-01-15 2023-01-15 Audio watermark embedding and extracting method based on deep learning

Country Status (1)

Country Link
CN (1) CN115831131B (en)

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE19852805A1 (en) * 1998-11-15 2000-05-18 Florian M Koenig Telephone with improved speech understanding, several microphones, and special speech signal processing
AU2001231109A1 (en) * 2000-01-24 2001-07-31 Businger, Peter A. Transform domain allocation for multimedia watermarking
CN101115124B (en) * 2006-07-26 2012-04-18 日电(中国)有限公司 Method and apparatus for identifying media program based on audio watermark
CN101290772B (en) * 2008-03-27 2011-06-01 上海交通大学 Embedding and extracting method for audio zero water mark based on vector quantization of coefficient of mixed domain
JP2011197619A (en) * 2010-03-18 2011-10-06 Yasuo Sano Perception improving means for acoustic signal using electronic watermark
EP2431970A1 (en) * 2010-09-21 2012-03-21 Fraunhofer-Gesellschaft zur Förderung der Angewandten Forschung e.V. Watermark generator, watermark decoder, method for providing a watermarked signal based on discrete valued data and method for providing discrete valued data in dependence on a watermarked signal
CN102074237B (en) * 2010-11-30 2012-07-04 辽宁师范大学 Digital audio watermarking method based on invariant characteristic of histogram
CN102074238A (en) * 2010-12-13 2011-05-25 山东科技大学 Linear interference cancellation-based speech secrete communication method
CN108962267B (en) * 2018-07-09 2019-11-15 成都信息工程大学 A kind of encryption voice content authentication method based on Hash feature
CN111292756B (en) * 2020-01-19 2023-05-26 成都潜在人工智能科技有限公司 Compression-resistant audio silent watermark embedding and extracting method and system
CN113808557A (en) * 2020-06-12 2021-12-17 比亚迪股份有限公司 Vehicle-mounted audio processing system, method and device

Also Published As

Publication number Publication date
CN115831131A (en) 2023-03-21

Similar Documents

Publication Publication Date Title
Hua et al. Twenty years of digital audio watermarking—a comprehensive review
JP3856652B2 (en) Hidden data embedding method and apparatus
Kirovski et al. Blind pattern matching attack on watermarking systems
Dhar et al. A new audio watermarking system using discrete fourier transform for copyright protection
JP2002519916A (en) Apparatus and method for embedding information into analog signal using duplicate modulation
WO2002091376A1 (en) Generation and detection of a watermark robust against resampling of an audio signal
Dhar et al. Digital watermarking scheme based on fast Fourier transformation for audio copyright protection
US20180144755A1 (en) Method and apparatus for inserting watermark to audio signal and detecting watermark from audio signal
Cvejic et al. Robust audio watermarking in wavelet domain using frequency hopping and patchwork method
JP4504681B2 (en) Method and device for embedding auxiliary data in an information signal
Zhang et al. An audio digital watermarking algorithm transmitted via air channel in double DCT domain
Liu et al. Dear: A deep-learning-based audio re-recording resilient watermarking
GB2431838A (en) Audio processing
CN115831131B (en) Audio watermark embedding and extracting method based on deep learning
US20020184503A1 (en) Watermarking
KR20030016381A (en) Watermarking
Subhashini et al. Robust audio watermarking for monitoring and information embedding
WO2001026110A1 (en) Embedding and detecting watermarks in one-dimensional information signals
Arnold et al. Quality evaluation of watermarked audio tracks
Acevedo Audio watermarking: properties, techniques and evaluation
JP2000209097A (en) Signal processor, signal processing method, signal recorder, signal reproducing device and recording medium
Dhar et al. An efficient audio watermarking algorithm in frequency domain for copyright protection
CN114743555A (en) Method and device for realizing audio watermarking
Cai et al. A WAV Format Audio Digital Watermarking Algorithm Based on HAS
KR100821349B1 (en) Method for generating digital watermark and detecting digital watermark

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant