CN106409302B

CN106409302B - Audio watermarking method and system based on embedding region selection

Info

Publication number: CN106409302B
Application number: CN201610458412.9A
Authority: CN
Inventors: 陈怡�; 高戈; 张康; 吕冰; 刘影
Original assignee: Central China Normal University
Current assignee: Central China Normal University
Priority date: 2016-06-22
Filing date: 2016-06-22
Publication date: 2019-07-09
Anticipated expiration: 2036-06-22
Also published as: CN106409302A

Abstract

The invention provides an audio watermarking method and system based on embedding region selection. The embedding process includes reading an audio file, first judging whether each frame of the signal can serve as an embedding region, then selecting the embedding frequency band of the audio watermark, performing a discrete Fourier transform, generating a binary pseudo-random spread spectrum sequence, embedding the watermark, and transforming back to the time domain. The detection process includes reading the audio file to be detected, judging whether each frame of the signal can serve as an embedding region, calculating the frequency-domain start point and end point of the detection range, performing a discrete Fourier transform, generating a binary pseudo-random spread spectrum sequence, calculating the sufficient statistic for detection, and obtaining the detected watermark bits. The invention filters out transient signals using the ratio of the maximum energy to the minimum energy within a frame to improve the accuracy of watermark detection, and improves the robustness of the watermark by embedding it in a frequency band that is perceptually significant to the human ear.

Description

Audio Watermarking Method and System Based on Embedding Region Selection

Technical Field

The present invention relates to the technical field of digital audio watermarking, and in particular to an audio watermarking method and system based on embedding region selection.

Background

Digital audio watermarking is a signal processing operation that adds digital information to an audio signal for purposes such as authenticity verification, copyright protection, and information hiding. Embedding region selection refers to choosing suitable regions of the audio in which to embed the watermark before the watermark is embedded. Traditional audio watermarking techniques do not take the characteristics of the audio signal into account and embed the watermark across the entire audio file, which leads to the following problems: 1) after a watermark is embedded in a low-amplitude region of the audio signal, the amplitude exceeds the masking threshold and produces noise, destroying perceptual transparency; 2) where the audio signal contains sharply changing transient signals, the variance of the signal in that region is large, so the bit error rate at detection is high after embedding; 3) when the watermark is embedded in the frequency domain in a region that is not perceptually significant to the human ear, part of the watermark is lost after signal processing or lossy audio compression, again leading to a high bit error rate at detection.

Summary of the Invention

The purpose of the present invention is to provide an audio watermarking technique with embedding region selection, so that the watermark is embedded in suitable audio regions, avoiding unnecessary noise and reducing bit errors.

To achieve the above purpose, the technical solution of the present invention provides an audio watermarking method based on embedding region selection, comprising an embedding process and a detection process.

The embedding process includes the following steps.

Step A1: read the audio file to obtain the sampling rate fs1 and the time-domain audio signal x_n of the n-th frame after framing, where the frame length is N.

First, judge whether each frame signal x_n can serve as an embedding region.

Then, for each frame signal x_n that can serve as an embedding region, select the embedding frequency band of the audio watermark. Let FWMIN and FWMAX be the preset start and end frequencies of embedding, chosen according to the frequency range to which human hearing is sensitive. The embedding start point freqmin1 and embedding end point freqmax1 of a frame are obtained as follows:

freqmin1 = floor((FWMIN × 2.0 / fs1) × N)

freqmax1 = floor((FWMAX × 2.0 / fs1) × N)

where floor is the round-down function.

Step A2: for each frame signal x_n in which a watermark can be embedded, perform a discrete Fourier transform to obtain the frequency-domain signal X_n.

Step A3: using the key as the random number seed, generate a binary pseudo-random spread spectrum sequence u of length freqmax1 - freqmin1 + 1.

Step A4: embed the watermark according to the spread spectrum sequence u, the frequency-domain signal X_n, and the watermark bit b to obtain the watermarked frequency-domain signal, calculated as follows:

|X′_n| = |X_n| + bαu

where α is a constant that controls the embedding strength of the watermark, and |X_n| and |X′_n| denote the frequency-domain magnitude before and after embedding, respectively. The watermarked frequency-domain signal is then obtained through Euler's formula:

$$X'_n = |X'_n| \, e^{j\angle X_n}$$

where ∠X_n denotes the phase of the frequency-domain signal, X′_n denotes the watermarked frequency-domain signal, and e is the base of the natural exponential.

Step A5: transform the watermarked frequency-domain signal X′_n to the time domain and generate the watermarked audio file.

The detection process includes the following steps.

Step B1: read the audio file to be detected to obtain the n-th frame signal z_n after time-domain framing and the sampling rate fs2.

First, judge whether each frame signal z_n can serve as an embedding region.

For each frame signal z_n that can serve as an embedding region, treat it as a signal to be detected and calculate the frequency-domain start point freqmin2 and end point freqmax2 of the detection range:

freqmin2 = floor((FWMIN × 2.0 / fs2) × N)

freqmax2 = floor((FWMAX × 2.0 / fs2) × N)

Step B2: perform a discrete Fourier transform to obtain the frequency-domain signal Z_n of the signal to be detected; the corresponding frequency-domain magnitude is denoted |Z_n|.

Step B3: using the key as the random number seed, generate a binary pseudo-random spread spectrum sequence u of length freqmax2 - freqmin2 + 1.

Step B4: from the spread spectrum sequence u and the frequency-domain magnitude |Z_n| of the signal to be detected, calculate the sufficient statistic r_n for detection as their inner product:

$$r_n = \langle |Z_n|, u \rangle$$

If the sufficient statistic r_n ≥ 0, the detected watermark bit is b = 1; otherwise, the detected watermark bit is b = 0.
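A short worked derivation suggests why the sign of r_n can carry the bit. It is a sketch under assumptions that the text does not state: the detector receives the watermarked magnitudes unaltered (|Z_n| = |X′_n|), r_n is the inner product of |Z_n| with u over the embedding band, and L = freqmax2 - freqmin2 + 1 is the length of u.

```latex
\begin{aligned}
r_n &= \langle\, |X_n| + b\alpha u,\; u \,\rangle \\
    &= \langle\, |X_n|,\; u \,\rangle + b\alpha \,\langle u,\; u \rangle \\
    &= \langle\, |X_n|,\; u \,\rangle + b\alpha L ,
    \qquad \text{since } u_k \in \{+1,-1\} \text{ gives } \langle u, u \rangle = L .
\end{aligned}
```

The first term is interference from the host audio; the second grows with α and L and depends on b. Under this reading, a threshold at zero separates the two bit values most cleanly when b is mapped to +1 and -1 before embedding, which is an assumption here because the text only names the detected values 1 and 0.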

Moreover, in step A1 and step B1, the judgment of whether each frame signal x_n can serve as an embedding region is implemented as follows:

1) Check whether the average energy of the signal x_n exceeds the preset threshold τ₁. If it does not, the frame is a silent region and watermark embedding is not allowed.

2) If the signal x_n contains a transient signal, watermark embedding is not allowed.

Moreover, whether the signal x_n contains a transient signal is determined as follows:

Decompose a frame of the signal into S blocks and calculate the energy of each of the S blocks. Compare the energy ratio rate between the maximum-energy block and the minimum-energy block with the preset threshold τ₂. If rate is greater than τ₂, the frame is considered to contain a transient signal.

The present invention correspondingly provides an audio watermarking system based on embedding region selection, comprising an audio watermark embedding subsystem and a watermark detection subsystem.

The audio watermark embedding subsystem includes the following modules.

An embedding region selection module, configured to read the audio file and obtain the sampling rate fs1 and the time-domain audio signal x_n of the n-th frame after framing, where the frame length is N;

then, for each frame signal x_n that can serve as an embedding region, to select the embedding frequency band of the audio watermark. Let FWMIN and FWMAX be the preset start and end frequencies of embedding, chosen according to the frequency range to which human hearing is sensitive. The embedding start point freqmin1 and embedding end point freqmax1 of a frame are obtained as follows:

freqmin1 = floor((FWMIN × 2.0 / fs1) × N)

freqmax1 = floor((FWMAX × 2.0 / fs1) × N)

A first time-frequency conversion module, configured to perform a discrete Fourier transform on each frame signal x_n in which a watermark can be embedded to obtain the frequency-domain signal X_n.

A first spread spectrum sequence generation module, configured to generate, using the key as the random number seed, a binary pseudo-random spread spectrum sequence u of length freqmax1 - freqmin1 + 1.

A watermark embedding module, configured to embed the watermark according to the spread spectrum sequence u, the frequency-domain signal X_n, and the watermark bit b to obtain the watermarked frequency-domain signal, calculated as follows:

|X′_n| = |X_n| + bαu

An inverse time-frequency transform module, configured to transform the watermarked frequency-domain signal X′_n to the time domain and generate the watermarked audio file.

The watermark detection subsystem includes the following modules.

A detection region selection module, configured to read the audio file to be detected, obtain the n-th frame signal z_n after time-domain framing and the sampling rate fs2, and calculate the start point freqmin2 and end point freqmax2 of the detection range:

freqmin2 = floor((FWMIN × 2.0 / fs2) × N)

freqmax2 = floor((FWMAX × 2.0 / fs2) × N)

A second time-frequency conversion module, configured to perform a discrete Fourier transform to obtain the frequency-domain signal Z_n of the signal to be detected; the corresponding frequency-domain magnitude is denoted |Z_n|.

A second spread spectrum sequence generation module, configured to generate, using the key as the random number seed, a binary pseudo-random spread spectrum sequence u of length freqmax2 - freqmin2 + 1.

A correlation detection module, configured to calculate the sufficient statistic r_n for detection from the spread spectrum sequence u and the frequency-domain magnitude |Z_n| of the signal to be detected, as the inner product r_n = ⟨|Z_n|, u⟩.

Moreover, in the embedding region selection module and the detection region selection module, the judgment of whether each frame signal can serve as an embedding region is implemented in the same way as in steps A1 and B1 of the method.

The invention filters out transient signals using the ratio of the maximum energy to the minimum energy within a frame to improve the accuracy of watermark detection, and improves the robustness of the watermark by embedding it in a frequency band that is perceptually significant to the human ear. It further uses the average energy to filter out quiet regions, improving perceptual transparency. The technical solution of the present invention has significant market value.

Brief Description of the Drawings

FIG. 1 is a structural block diagram of the embedding subsystem according to an embodiment of the present invention.

FIG. 2 is a structural block diagram of the detection subsystem according to an embodiment of the present invention.

FIG. 3 is a flowchart of the embedding process according to an embodiment of the present invention.

FIG. 4 is a flowchart of the detection process according to an embodiment of the present invention.

Detailed Description

The technical solution of the present invention is further described below through specific embodiments with reference to the accompanying drawings.

An embodiment of the present invention provides an audio watermarking system based on embedding region selection, comprising an audio watermark embedding subsystem and a watermark detection subsystem.

Referring to FIG. 1, the audio watermark embedding subsystem provided by the embodiment includes an embedding region selection module 1, a first time-frequency conversion module 2, a first spread spectrum sequence generation module 3, a watermark embedding module 4, and an inverse time-frequency transform module 5. In a specific implementation, each module may be implemented using software solidification (firmware) techniques.

The embedding region selection module 1 evaluates the time-domain audio frames that are read. In a specific implementation, it can judge frame by frame whether the conditions for embedding a watermark are met: if not, the frame is skipped and judgment continues with the next frame; if so, the signal is output to the first time-frequency conversion module 2. The module calculates the range of the frequency-domain signal in which the watermark is embedded from the sampling rate of the time-domain audio signal and the frequency range to which the human ear is more sensitive, outputs the frequency-domain signal within the embeddable range to the watermark embedding module 4, and outputs the maximum and minimum of the embedding range to the first spread spectrum sequence generation module 3.

The first time-frequency conversion module 2 converts the time-domain audio signal that was read into a frequency-domain signal and outputs it to the watermark embedding module 4.

The first spread spectrum sequence generation module 3 generates, from the random number seed and the maximum and minimum of the embedding range supplied by the embedding region selection module 1, a uniformly distributed random sequence with values of 1 or -1 and the same length as the embedding range, and outputs this sequence to the watermark embedding module 4.

The watermark embedding module 4 operates on the magnitude spectrum of the frequency-domain signal, generates a frequency-domain audio signal carrying the watermark information, and outputs it to the inverse time-frequency transform module 5.

The inverse time-frequency transform module 5 converts the frequency-domain watermarked audio signal supplied by the watermark embedding module 4 into a time-domain watermarked audio signal and writes it out as an audio file, yielding the watermarked audio file.

Referring to FIG. 2, the watermark detection subsystem provided by the embodiment includes a detection region selection module 6, a second time-frequency conversion module 7, a second spread spectrum sequence generation module 8, and a correlation detection module 9. In a specific implementation, each module may be implemented using software solidification (firmware) techniques.

The detection region selection module 6 has essentially the same function as the embedding region selection module 1. Regions that do not meet the embedding conditions generally contain no watermark and need not be considered during detection. In a specific implementation, frames can be judged one by one: a frame that does not meet the detection conditions is skipped without detection, and judgment continues with the next frame. Audio signals that meet the detection conditions are output to the second time-frequency conversion module 7, and the maximum and minimum of the frequency detection range are likewise output to the second time-frequency conversion module 7 and the second spread spectrum sequence generation module 8.

The second time-frequency conversion module 7 converts the time-domain audio signal that was read into a frequency-domain signal and outputs it to the correlation detection module 9.

The second spread spectrum sequence generation module 8 has essentially the same function as the first spread spectrum sequence generation module 3 and outputs its result to the correlation detection module 9.

The correlation detection module 9 calculates, over the detection range, the correlation value between the input frequency-domain magnitude signal to be detected and the spread spectrum sequence supplied by the second spread spectrum sequence generation module 8, and determines the watermark from the sign of the correlation value.

For the specific implementation of each module, refer to the corresponding steps of the method; they are not repeated here. The audio watermarking method based on embedding region selection provided by the embodiment of the present invention includes an embedding process and a detection process.

Referring to FIG. 3, the region-selection-based audio watermark embedding process provided by the embodiment can be run automatically by means of computer software, and specifically includes the following steps.

Step A1: read the audio file and first divide the time-domain audio signal x into frames, obtaining the sampling rate fs1 and the n-th frame time-domain audio signal x_n after framing (frame length N). Judge whether each frame signal x_n can serve as an embedding region. The judgment has two parts:

1) Determine whether the current frame x_n is a silent region by checking whether its average energy exceeds the set threshold. A silent region must not carry a watermark; a frame whose average energy exceeds the threshold is not silent and may be embedded. The average energy of the n-th frame is computed and compared with the threshold as follows:

$$\frac{1}{N}\sum_{i=0}^{N-1} x_n^2(i) > \tau_1 \quad (1)$$

where N is the frame length, i.e. the number of sample points in a frame; i is the sample index within a frame, from 0 to N-1; x_n²(i) is the energy of the n-th frame time-domain signal x_n at the i-th point in the frame; and τ₁ is the decision threshold for average energy, which those skilled in the art may preset in a specific implementation, for example empirically. If the threshold is exceeded, condition 1) is satisfied and condition 2) below is evaluated.
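The silence test in condition 1) can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function names, the sample values, and the threshold 1e-4 are hypothetical, and treating an average energy exactly equal to τ₁ as silent is an assumption, since the text only says the energy must exceed the threshold.

```python
def average_energy(frame):
    """Mean of the squared samples of one frame."""
    return sum(s * s for s in frame) / len(frame)


def is_silent(frame, tau1):
    """A frame is treated as silent when its average energy does not exceed tau1.

    Assumption: the boundary case (energy == tau1) counts as silent.
    """
    return average_energy(frame) <= tau1


TAU1 = 1e-4  # hypothetical threshold; the text leaves the value to the implementer

quiet_frame = [0.001, -0.002, 0.001, 0.0]
loud_frame = [0.5, -0.4, 0.3, -0.6]

print(is_silent(quiet_frame, TAU1))
print(is_silent(loud_frame, TAU1))
```

Frames for which `is_silent` returns True would be skipped before the transient test is applied.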

2) When a transient signal occurs within a frame, its rapidly changing frequency content produces a large variance, and the larger the signal variance at detection, the higher the probability of watermark detection errors; such frames should not carry a watermark either. The frame is split into S blocks, the energy of each block is computed, and the energy ratio rate between the maximum-energy block and the minimum-energy block is compared with the threshold τ₂: if rate is greater than τ₂, the frame is considered to contain a transient signal and no watermark is embedded; otherwise a watermark may be embedded. Those skilled in the art may preset the value of S in a specific implementation.

The specific implementation is as follows.

First, divide a frame signal x_n into S blocks; the number of sample points M in each block is

M = N/S (2)

The energy E_i of each block is calculated as

$$E_i = \sum_{j=iM}^{(i+1)M-1} x_n^2(j) \quad (3)$$

where i is the index of the block within the frame, j is the index of the sample point within the frame, and x_n²(j) is the energy of the n-th frame time-domain signal x_n at the j-th point in the frame.

Find the maximum energy E_Max and the minimum energy E_Min among the block energies:

E_Max = MAX{E_i}, E_Min = MIN{E_i}, i ∈ [0, S-1] (4)

where MAX and MIN denote the maximum and minimum functions, respectively.

The ratio rate of the maximum energy to the minimum energy is calculated as:

$$rate = \frac{E_{Max}}{E_{Min}} \quad (5)$$

If rate > τ₂, a transient signal is considered to be present in the frame x_n and no watermark is embedded in that frame; otherwise, a watermark may be embedded. Here τ₂ is the detection threshold for transient signals, which those skilled in the art may preset in a specific implementation, for example empirically.
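The block-energy test for transients can be sketched as follows. The function names and the example values of S and τ₂ are hypothetical. The handling of a zero minimum energy is also an assumption, because the text does not say what to do when E_Min = 0 and the ratio is undefined.

```python
def block_energies(frame, num_blocks):
    """Split a frame into num_blocks equal blocks and return the energy of each."""
    block_len = len(frame) // num_blocks  # M = N / S
    return [
        sum(s * s for s in frame[i * block_len:(i + 1) * block_len])
        for i in range(num_blocks)
    ]


def has_transient(frame, num_blocks, tau2):
    """Return True when the max/min block energy ratio exceeds tau2."""
    energies = block_energies(frame, num_blocks)
    e_max, e_min = max(energies), min(energies)
    if e_min == 0.0:
        # Assumption: a zero-energy block next to a non-zero one is a transient;
        # an all-zero frame is not (it would fail the silence test anyway).
        return e_max > 0.0
    return e_max / e_min > tau2


steady = [0.5, -0.5] * 8             # constant level across the frame
attack = [0.01] * 12 + [1.0] * 4     # sudden onset in the last block

print(has_transient(steady, 4, 10.0))
print(has_transient(attack, 4, 10.0))
```

With S = 4 the steady frame has a ratio of 1 and passes, while the frame with the late onset has a ratio far above the hypothetical τ₂ = 10 and would be skipped.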

Then, for each frame signal x_n that can serve as an embedding region, the embedding frequency band of the audio watermark should be a region that is perceptually significant to the human ear. Those skilled in the art may preset it according to the characteristics of human auditory perception, for example 1000-7000 Hz. Signals in this region are not removed by attacks such as filtering and audio compression, so a watermark embedded in a perceptually significant region is not erased by such signal attacks and can still be detected. Let FWMIN and FWMAX be the preset start and end frequencies of embedding, chosen according to the frequency range to which human hearing is sensitive; the corresponding embedding start point freqmin1 and embedding end point freqmax1 of a frame are obtained as follows:

freqmin1 = floor((FWMIN × 2.0 / fs1) × N) (6)

freqmax1 = floor((FWMAX × 2.0 / fs1) × N) (7)

where floor is the round-down function.

The frequency-domain audio signal within the range from the embedding start point freqmin1 to the embedding end point freqmax1 is then selected.
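Formulas (6) and (7) translate directly into code. The sketch below copies the expressions as written, including the factor 2.0; the function name is hypothetical, and the sampling rate of 44100 Hz and frame length of 1024 are illustrative values that do not come from the text. The band 1000-7000 Hz is the example given above.

```python
import math


def embedding_band(fw_min_hz, fw_max_hz, sample_rate_hz, frame_len):
    """Return (freqmin, freqmax) computed exactly as in formulas (6) and (7)."""
    freq_min = math.floor((fw_min_hz * 2.0 / sample_rate_hz) * frame_len)
    freq_max = math.floor((fw_max_hz * 2.0 / sample_rate_hz) * frame_len)
    return freq_min, freq_max


# Hypothetical settings: 44.1 kHz audio, 1024-sample frames, 1000-7000 Hz band.
freq_min, freq_max = embedding_band(1000, 7000, 44100, 1024)
sequence_length = freq_max - freq_min + 1  # length of the spreading sequence u

print(freq_min, freq_max, sequence_length)
```

Because the same expressions are evaluated with fs2 at detection, embedder and detector arrive at the same indices whenever the sampling rate and frame length are unchanged.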

In a specific implementation, frames can be judged one by one; a frame that does not satisfy the conditions is skipped and judgment proceeds to the next frame.
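The frame-by-frame skipping can be sketched as a small generator. This is an illustration under assumptions: the text does not say whether frames overlap or how a short final frame is handled, so the sketch uses consecutive non-overlapping frames and drops an incomplete tail. The names `split_frames`, `eligible_frames`, and the `can_embed` predicate are hypothetical; the predicate stands in for the silence and transient tests of step A1.

```python
def split_frames(samples, frame_len):
    """Cut a signal into consecutive non-overlapping frames, dropping a short tail."""
    count = len(samples) // frame_len
    return [samples[n * frame_len:(n + 1) * frame_len] for n in range(count)]


def eligible_frames(samples, frame_len, can_embed):
    """Yield (frame_index, frame) for every frame accepted by can_embed."""
    for n, frame in enumerate(split_frames(samples, frame_len)):
        if can_embed(frame):
            yield n, frame  # frames that fail the test are simply skipped


# Toy signal: silent frame, active frame, silent frame, then a 3-sample tail.
signal = [0.0] * 4 + [0.5, -0.5, 0.5, -0.5] + [0.0] * 4 + [0.3] * 3

def loud_enough(frame):
    """Stand-in predicate: average energy above a hypothetical threshold."""
    return sum(s * s for s in frame) / len(frame) > 0.01

for index, frame in eligible_frames(signal, 4, loud_enough):
    print(index, frame)
```

Running the detector with the same predicate and frame length keeps the two sides aligned on which frames carry a bit, provided the audio has not been altered enough to change the outcome of the test.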

Step A2: apply an FFT (fast Fourier transform) to each signal frame x_n in which a watermark can be embedded to obtain the frequency-domain signal X_n.

Step A3: using the key as the random number seed, generate a binary pseudo-random spread spectrum sequence u of length freqmax1 - freqmin1 + 1.

The specific procedure of the embodiment in MATLAB is as follows.

First, use the key to call RandStream (random stream seeding) to initialize the rand function (random number generation), then call rand to generate random numbers. Because rand produces numbers between 0 and 1, these numbers are rounded to the nearest integer to form a binary pseudo-random sequence of 0s and 1s, and this unipolar pseudo-random sequence is then converted into a bipolar pseudo-random sequence u containing only +1 and -1.
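The same three stages (seed, round to 0/1, map to ±1) can be sketched in Python with the standard `random` module. This is an analogue of the MATLAB procedure, not a port: Python's generator differs from MATLAB's, so the two will not produce the same sequence for the same key. The function name and the key value are hypothetical.

```python
import random


def spreading_sequence(key, length):
    """Generate a keyed pseudo-random sequence of +1/-1 values."""
    rng = random.Random(key)  # private generator seeded with the key
    unipolar = [round(rng.random()) for _ in range(length)]  # values 0 or 1
    return [2 * bit - 1 for bit in unipolar]  # map 0 -> -1 and 1 -> +1


u_embed = spreading_sequence(1234, 280)   # embedder side
u_detect = spreading_sequence(1234, 280)  # detector side, same key and length

print(u_embed[:8])
print(u_embed == u_detect)
```

Seeding a dedicated `random.Random` instance instead of the module-level generator keeps the sequence reproducible even if other code draws random numbers in between.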

Step A4: embed the watermark according to the spread spectrum sequence u, the frequency-domain signal X_n, and the watermark bit b using formula (8) below to obtain the watermarked frequency-domain signal:

|X′_n| = |X_n| + bαu (8)

where α is a constant that controls the embedding strength of the watermark and may be preset by those skilled in the art in a specific implementation; |X_n| and |X′_n| denote the frequency-domain magnitude before and after embedding, respectively. The watermarked frequency-domain signal is then obtained through Euler's formula:

$$X'_n = |X'_n| \, e^{j\angle X_n} \quad (9)$$

where ∠X_n denotes the phase of the frequency-domain signal, X′_n denotes the watermarked frequency-domain signal, and e is the base of the natural exponential.
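Step A4 can be sketched with Python's `cmath` module: the magnitude of each bin in the band is adjusted and the original phase is reattached. The function name and the toy spectrum are hypothetical. Two assumptions are made: u[0] is applied to bin freqmin1, u[1] to the next bin, and so on; and the value passed as `bit` is used exactly as b in the magnitude formula, since the text does not state whether b is 0/1 or ±1 at this point.

```python
import cmath


def embed_bit(spectrum, bit, alpha, u, freq_min):
    """Return a copy of spectrum with |X| + bit*alpha*u applied from bin freq_min on."""
    marked = list(spectrum)
    for offset, chip in enumerate(u):
        k = freq_min + offset
        magnitude, phase = cmath.polar(spectrum[k])
        new_magnitude = magnitude + bit * alpha * chip
        # Rebuild the complex bin from the new magnitude and the unchanged phase.
        # If new_magnitude is negative, rect() yields a bin with flipped phase.
        marked[k] = cmath.rect(new_magnitude, phase)
    return marked


toy_spectrum = [complex(3, 4)] * 4  # every bin has magnitude 5
marked = embed_bit(toy_spectrum, bit=1, alpha=0.5, u=[1, -1], freq_min=1)

print([round(abs(value), 3) for value in marked])
```

The sketch only touches the bins inside the band. Producing a real-valued frame after the inverse transform would also require the mirrored (conjugate-symmetric) bins to be updated consistently, which the text does not discuss and the sketch leaves out.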

步骤A5，将嵌入水印后的频域信号X′_n变换到时域，最后生成音频文件，即得到嵌入水印的音频文件。Step A5, transform the watermark-embedded frequency-domain signal _X'n into the time-domain, and finally generate an audio file, that is, obtain a watermark-embedded audio file.

参见图4，本发明实施例提供的基于选择区域嵌入的音频水印检测过程，可以采用计算机软件技术手段自动进行流程，具体包括以下步骤：Referring to FIG. 4 , the audio watermark detection process based on selection area embedding provided by the embodiment of the present invention can be automatically performed by computer software technical means, and specifically includes the following steps:

步骤B1，读取待检测的音频文件，得到的时域分帧后的第n帧信号z_n和采样率fs2，对各时域信号z_n采取步骤A1中一样的判决方法，Step B1, read the audio file to be detected, obtain the n-th frame signal z _n and the sampling rate fs2 after the time domain is divided into frames, and adopt the same judgment method in step A1 for each time domain signal z _n ,

即考虑如下两个条件，That is, considering the following two conditions,

则不为静音区且不包含瞬态信号的帧信号，能够嵌入水印并有待检测。Then the frame signal that is not a silent zone and does not contain transient signals can be embedded with a watermark and needs to be detected.

针对能够作为嵌入区域的各帧信号x_n，作为待检测的信号，计算检测范围的频域起始点freqmin2和频域结束点freqmax2For each frame signal x _n that can be used as the embedded area, as the signal to be detected, calculate the frequency domain start point freqmin2 and the frequency domain end point freqmax2 of the detection range

freqmin2＝floor((FWMIN×2.0/fs2)×N) (10)freqmin2=floor((FWMIN×2.0/fs2)×N) (10)

freqmax2＝floor((FWMAX×2.0/fs2)×N) (11)freqmax2=floor((FWMAX×2.0/fs2)×N) (11)

步骤B2，对于满足检测条件的信号z_n，进行离散傅立叶变换得到待检测信号的频域信号Z_n，相应频域幅度值记为|Z_n|。Step B2, for the signal z _n that satisfies the detection condition, perform discrete Fourier transform to obtain the frequency domain signal Zn of the signal to be detected, and the corresponding frequency domain amplitude value _{is denoted as |Z n} _| .

步骤B3，利用密钥key，生成二进制扩频序列u(与上面嵌入方法得到的u方式相同)，即利用密钥key作为随机数种子，生成长度为freqmax2-freqmin2+1的二进制伪随机扩频序列u。Step B3, use the key key to generate a binary spread spectrum sequence u (same as the u obtained by the above embedding method), that is, use the key key as a random number seed to generate a binary pseudo-random spread spectrum with a length of freqmax2-freqmin2+1 sequence u.

Step B4: from the spreading sequence u and the frequency-domain magnitude |Z_n| of the signal to be detected, compute the sufficient statistic r_n for detection as the correlation between u and |Z_n|,

where <·> denotes the inner product of the signals.
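The exact expression for r_n is not reproduced in the text above. The sketch below assumes the simplest reading: a plain, unnormalized inner product between u and the magnitudes |Z_n| in the bins freqmin2 to freqmax2. The function name and the absence of normalization are assumptions.

```python
def detection_statistic(magnitudes, u, start_bin):
    """Inner product <|Z_n|, u> over the band that starts at start_bin (assumed form)."""
    band = magnitudes[start_bin:start_bin + len(u)]
    if len(band) != len(u):
        raise ValueError("band does not fit inside the spectrum")
    return sum(m * chip for m, chip in zip(band, u))


# Toy values: bins 1..3 carry magnitudes 2, 1, 3 and u = (+1, -1, +1).
r_n = detection_statistic([0.0, 2.0, 1.0, 3.0, 5.0], [1, -1, 1], 1)
print(r_n)  # 4.0
```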

The specific embodiments described herein merely illustrate the spirit of the invention. Those skilled in the art to which the invention pertains may modify or supplement the described embodiments in various ways, or substitute similar approaches, without departing from the spirit of the invention or exceeding the scope defined by the appended claims.

Claims

1. An audio watermarking method based on embedding region selection, characterized by comprising an embedding process and a detection process, the embedding process comprising the following steps,

Step A1: read the audio file to obtain the sampling rate fs1 and the n-th frame time-domain audio signal x_n after framing, with frame length N; first decide, for each frame signal x_n, whether it can serve as an embedding region,

then, for each frame signal x_n that can serve as an embedding region, select the embedding frequency band of the audio watermark: let the preset embedding start frequency, chosen according to the frequency range to which human hearing is sensitive, be FWMIN and the end frequency be FWMAX; the embedding start point freqmin1 and the embedding end point freqmax1 of a frame are then obtained as follows,

freqmin1 = floor((FWMIN × 2.0 / fs1) × N)

freqmax1 = floor((FWMAX × 2.0 / fs1) × N)

where floor is the round-down function;

Step A2: for each frame signal x_n into which a watermark can be embedded, perform a discrete Fourier transform to obtain the frequency-domain signal X_n;

Step A3: using the key, key, as the random number seed, generate a binary pseudo-random spreading sequence u of length freqmax1-freqmin1+1;

Step A4: embed the watermark according to the spreading sequence u, the frequency-domain signal X_n and the watermark bit b, obtaining the frequency-domain signal after watermark embedding, computed as follows,

|X′_n| = |X_n| + bαu

where α is a constant that controls the embedding strength of the watermark, and |X_n| and |X′_n| denote the frequency-domain magnitude before and after watermark embedding, respectively; the frequency-domain signal after watermark embedding is then obtained by Euler's formula,

X′_n = |X′_n| · e^(j∠X_n)

where ∠X_n denotes the phase of the frequency-domain signal, X′_n denotes the frequency-domain signal after watermark embedding, e is the base of the natural exponential, and the superscript j is the imaginary unit;

Step A5: transform the frequency-domain signal X′_n after watermark embedding to the time domain, and generate the watermarked audio file;

the detection process comprising the following steps,

Step B1: read the audio file to be detected to obtain the n-th frame signal z_n after time-domain framing and the sampling rate fs2; first decide, for each frame signal x_n, whether it can serve as an embedding region;

for each frame signal x_n that can serve as an embedding region, taken as the signal to be detected, compute the frequency-domain start point freqmin2 and the frequency-domain end point freqmax2 of the detection range

freqmin2 = floor((FWMIN × 2.0 / fs2) × N)

freqmax2 = floor((FWMAX × 2.0 / fs2) × N)

Step B2: perform a discrete Fourier transform to obtain the frequency-domain signal Z_n of the signal to be detected, the corresponding frequency-domain magnitude being denoted |Z_n|;

Step B3: using the key, key, as the random number seed, generate a binary pseudo-random spreading sequence u of length freqmax2-freqmin2+1;

Step B4: from the spreading sequence u and the frequency-domain magnitude |Z_n| of the signal to be detected, compute the sufficient statistic r_n for detection as follows,

if the sufficient statistic r_n ≥ 0, the detected watermark bit is b = 1; otherwise, the detected watermark bit is b = 0.
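The magnitude update of step A4 and the decision rule of step B4 can be traced end to end in a short Python sketch that works directly on a toy spectrum. Several points are assumptions and not statements of the claim: the detected bits 1 and 0 are taken to correspond to b = +1 and b = -1 in the embedding formula, r_n is taken to be a plain inner product, and the transforms of steps A2, A5 and B2 are skipped. The names embed_band and detect_bit are hypothetical.

```python
import cmath


def embed_band(spectrum, start_bin, u, b, alpha):
    """Apply |X'| = |X| + b*alpha*u on the band, then rebuild X' with the original phase."""
    marked = list(spectrum)
    for offset, chip in enumerate(u):
        k = start_bin + offset
        new_magnitude = abs(spectrum[k]) + b * alpha * chip
        # Euler's formula: X' = |X'| * exp(j * angle(X)). This assumes new_magnitude >= 0.
        marked[k] = cmath.rect(new_magnitude, cmath.phase(spectrum[k]))
    return marked


def detect_bit(spectrum, start_bin, u):
    """Return (bit, r): bit is 1 when the assumed statistic r = <|Z|, u> is >= 0, else 0."""
    r = sum(abs(spectrum[start_bin + offset]) * chip for offset, chip in enumerate(u))
    return (1 if r >= 0 else 0), r


# Toy host spectrum: eight bins of magnitude 10 with arbitrary phases.
u = [1, -1, 1, -1]
host = [cmath.rect(10.0, 0.3 * k) for k in range(8)]
marked_one = embed_band(host, 2, u, +1, 0.5)   # carries bit 1
marked_zero = embed_band(host, 2, u, -1, 0.5)  # carries bit 0
print(detect_bit(marked_one, 2, u)[0], detect_bit(marked_zero, 2, u)[0])
```

In this toy case the host contributes nothing to the correlation because u is balanced and the host magnitudes are flat. With real audio the host term <|X_n|, u> acts as interference, which is why α has to be large enough relative to it.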

2. The audio watermarking method based on embedding region selection according to claim 1, characterized in that: in step A1 and step B1, the decision for each frame signal x_n on whether it can serve as an embedding region is implemented as follows,

1) the average energy of signal x_n is compared against the corresponding preset threshold τ₁; a frame thereby judged to be a silent region is not allowed to carry a watermark;

the average energy of the n-th frame is calculated by the following formula

where N is the frame length, i is the index of a sample within a frame, and x_n²(i) denotes the energy of the i-th sample of the n-th frame time-domain signal x_n; τ₁ is the decision threshold for the average energy;

2) if signal x_n contains a transient signal, embedding a watermark is not allowed.
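Condition 1) can be illustrated with a short Python sketch. The energy formula itself is not reproduced in the text, so the sketch assumes the form suggested by the symbol definitions, namely the mean of the squared samples x_n²(i) over the N samples of the frame. It further assumes that a frame counts as silent when that value is at or below τ₁. The function names and the threshold value are hypothetical.

```python
def average_energy(frame):
    """Assumed form of the average energy: (1/N) * sum of x_n(i)^2 over the frame."""
    return sum(sample * sample for sample in frame) / len(frame)


def is_silent(frame, tau1):
    """Assumed rule: the frame is a silent region when its average energy is <= tau1."""
    return average_energy(frame) <= tau1


TAU1 = 1e-4  # hypothetical threshold, not a value from the patent
print(average_energy([1.0, -1.0, 2.0, 0.0]))  # 1.5
print(is_silent([0.001] * 4, TAU1))           # True
```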

3. The audio watermarking method based on embedding region selection according to claim 2, characterized in that: whether signal x_n contains a transient signal is judged as follows,

a frame signal is divided into S blocks and the energy of each of the S blocks is computed; the ratio rate of the energy of the maximum-energy block to the energy of the minimum-energy block is compared with the corresponding preset threshold τ₂; if rate is greater than τ₂, the frame signal is considered to contain a transient signal.
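The block-energy test of this claim can be sketched in Python as follows. The claim does not say how the S blocks are laid out or how a zero-energy block is handled. This sketch assumes equal-length consecutive blocks, drops any leftover samples at the end of the frame, and treats a zero minimum with a nonzero maximum as a transient. The function name and the values of S and τ₂ are hypothetical.

```python
def has_transient(frame, num_blocks, tau2):
    """Compare max/min block energy against tau2 (equal-length blocks assumed)."""
    block_len = len(frame) // num_blocks
    if block_len == 0:
        raise ValueError("frame is shorter than the number of blocks")
    energies = [
        sum(s * s for s in frame[b * block_len:(b + 1) * block_len])
        for b in range(num_blocks)
    ]
    lowest, highest = min(energies), max(energies)
    if lowest == 0.0:
        # Avoid dividing by zero: any energy next to a silent block is treated as a jump.
        return highest > 0.0
    return highest / lowest > tau2


steady = [1.0, -1.0] * 8           # every block has the same energy, ratio 1
burst = [0.1] * 12 + [5.0] * 4     # last block is far louder than the rest
print(has_transient(steady, 4, 10.0), has_transient(burst, 4, 10.0))  # False True
```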

4. An audio watermarking system based on embedding region selection, characterized by comprising an audio watermark embedding subsystem and a watermark detection subsystem,

the audio watermark embedding subsystem comprising the following modules,

a suitable-region-selection embedding module, configured to read the audio file and obtain the sampling rate fs1 and the n-th frame time-domain audio signal x_n after framing, with frame length N,

first deciding, for each frame signal x_n, whether it can serve as an embedding region,

freqmin1 = floor((FWMIN × 2.0 / fs1) × N)

freqmax1 = floor((FWMAX × 2.0 / fs1) × N)

where floor is the round-down function;

a first time-frequency transform module, configured to perform, for each frame signal x_n into which a watermark can be embedded, a discrete Fourier transform to obtain the frequency-domain signal X_n;

a first spreading sequence generation module, configured to use the key, key, as the random number seed to generate a binary pseudo-random spreading sequence u of length freqmax1-freqmin1+1;

a watermark embedding module, configured to embed the watermark according to the spreading sequence u, the frequency-domain signal X_n and the watermark bit b, obtaining the frequency-domain signal after watermark embedding, computed as follows,

|X′_n| = |X_n| + bαu

an inverse time-frequency transform module, configured to transform the frequency-domain signal X′_n after watermark embedding to the time domain and generate the watermarked audio file;

the watermark detection subsystem comprising the following modules,

a suitable-region-selection detection module, configured to read the audio file to be detected and obtain the n-th frame signal z_n after time-domain framing and the sampling rate fs2,

first deciding, for each frame signal x_n, whether it can serve as an embedding region;

freqmin2 = floor((FWMIN × 2.0 / fs2) × N)

freqmax2 = floor((FWMAX × 2.0 / fs2) × N)

a second time-frequency transform module, configured to perform a discrete Fourier transform to obtain the frequency-domain signal Z_n of the signal to be detected, the corresponding frequency-domain magnitude being denoted |Z_n|;

a second spreading sequence generation module, configured to use the key, key, as the random number seed to generate a binary pseudo-random spreading sequence u of length freqmax2-freqmin2+1;

a correlation detection module, configured to compute, from the spreading sequence u and the frequency-domain magnitude |Z_n| of the signal to be detected, the sufficient statistic r_n for detection as follows,

5. The audio watermarking system based on embedding region selection according to claim 4, characterized in that: the suitable-region-selection embedding module and the suitable-region-selection detection module decide, for each frame signal x_n, whether it can serve as an embedding region, implemented as follows,

the average energy of the n-th frame is calculated by the following formula

6. The audio watermarking system based on embedding region selection according to claim 5, characterized in that: whether signal x_n contains a transient signal is judged as follows,