CN111445377A

CN111445377A - Audio processing method, device and storage medium

Info

Publication number: CN111445377A
Application number: CN202010226468.8A
Authority: CN
Inventors: 李纯
Original assignee: Shenzhen TCL Digital Technology Co Ltd
Current assignee: Shenzhen TCL Digital Technology Co Ltd
Priority date: 2020-03-26
Filing date: 2020-03-26
Publication date: 2020-07-24

Abstract

The invention provides an audio processing method, an audio processing device and a readable storage medium, wherein the method comprises the following steps: acquiring an original watermark image, and encrypting the original watermark image to obtain an encrypted watermark image; acquiring audio to be processed, and determining the embedded position information of the encrypted watermark image according to the information quantity of the encrypted watermark image and the audio to be processed; and embedding the encrypted watermark image into the audio to be processed according to the position information to obtain the watermark audio. The watermark image is added into the audio, which is equivalent to adding copyright information into the audio, so that the copyright of the audio is represented; in addition, because the watermark is embedded in an encrypted form, the difficulty of tampering is improved, and the integrity and the safety of the watermark (copyright information) are further ensured.

Description

Audio processing method, device and storage medium

Technical Field

The present invention relates to the field of data processing, and in particular, to an audio processing method, device, and storage medium.

Background

With the development and widespread use of multimedia technology and digital communication, various conventional multimedia works including audio works have begun to be spread out through the network as a medium to increase the speed of transmission and expand the range of transmission. However, the audio works based on network transmission are very easy to be illegally used, which results in that many original authors are unwilling to disclose their own audio works, thereby seriously limiting the transmission of the audio works; therefore, how to protect the copyright of the audio works is a problem to be solved urgently at present.

Disclosure of Invention

The invention mainly aims to provide an audio processing method, audio processing equipment and a storage medium, and aims to realize copyright protection of audio works.

To achieve the above object, an embodiment of the present invention provides an audio processing method, including:

acquiring an original watermark image, and encrypting the original watermark image to obtain an encrypted watermark image;

acquiring audio to be processed, and determining the embedded position information of the encrypted watermark image according to the information quantity of the encrypted watermark image and the audio to be processed;

and embedding the encrypted watermark image into the audio to be processed according to the position information to obtain the watermark audio.

Optionally, the step of obtaining the original watermark image and encrypting the original watermark image to obtain the encrypted watermark image includes:

acquiring an original watermark image, and performing signal decomposition on the original watermark image to acquire low-frequency and high-frequency information corresponding to the original watermark image;

encrypting the low-frequency and high-frequency information to obtain encrypted information;

and reconstructing the signal according to the encrypted information to obtain a corresponding encrypted watermark image.

Optionally, the low frequency and high frequency information comprises an approximation component and a detail component,

correspondingly, the step of encrypting the low-frequency and high-frequency information to obtain encrypted information includes:

determining the number of rows and columns of the approximate components, and determining the encrypted components corresponding to the approximate components according to the number of rows and columns and preset parameters;

and obtaining encryption information according to the encryption component and the detail component.

Optionally, the step of obtaining the audio to be processed and determining the position information of the encrypted watermark image according to the information amount of the encrypted watermark image and the audio to be processed includes:

acquiring audio to be processed, and segmenting the audio to be processed according to the information content of the encrypted watermark image and the audio to be processed to obtain a watermark carrier segment and a non-carrier segment;

determining the position information of the watermark carrier segment embedded by the encrypted watermark image according to the information quantity of the encrypted watermark image and the audio to be processed;

correspondingly, the step of embedding the encrypted watermark image into the audio to be processed according to the position information to obtain the watermark audio comprises:

and embedding the encrypted watermark image into the watermark carrier segment according to the position information to obtain an embedded segment, and combining the embedded segment and the non-carrier segment to obtain a watermark audio.

Optionally, the step of obtaining the audio to be processed and segmenting the audio to be processed according to the information amount of the encrypted watermark image and the audio to be processed to obtain a watermark carrier segment and a non-carrier segment includes:

acquiring audio to be processed, and determining the number of audio samples of the audio to be processed;

obtaining a one-dimensional watermark sequence according to the encrypted watermark image, and determining the number of watermark points in the watermark sequence;

determining the number of unit audio samples corresponding to each watermark point according to the number of the audio samples and the number of the watermark points;

determining the required audio sample number corresponding to the watermark sequence according to the unit audio sample number and the number of the watermark points;

and segmenting the audio to be processed according to the required audio sample number to obtain corresponding watermark carrier segments and non-carrier segments.
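The segmentation arithmetic described in the steps above can be illustrated with a minimal Python sketch. It assumes that the unit audio sample number is the integer quotient of the audio sample count by the watermark point count, and that the carrier segment is taken from the start of the audio; neither detail is stated in this section, and the function name `split_carrier` is hypothetical.

```python
def split_carrier(audio, watermark_points):
    """Split audio samples into a watermark carrier segment and a non-carrier segment."""
    n_samples = len(audio)
    n_points = len(watermark_points)
    if n_points == 0 or n_samples < n_points:
        raise ValueError("audio is too short to carry the watermark sequence")
    # Unit audio samples per watermark point (assumption: floor division).
    unit = n_samples // n_points
    # Required audio samples for the whole watermark sequence.
    required = unit * n_points
    # Assumption: the carrier segment is the leading part of the audio.
    carrier = audio[:required]
    non_carrier = audio[required:]
    return unit, required, carrier, non_carrier


audio = [0.01 * k for k in range(23)]   # 23 audio samples
bits = [1, 0, 1, 1, 0]                  # 5 watermark points
unit, required, carrier, rest = split_carrier(audio, bits)
print(unit, required, len(carrier), len(rest))  # 4 20 20 3
```

With 23 samples and 5 watermark points, each point is allotted 4 samples, 20 samples form the carrier segment, and the remaining 3 samples form the non-carrier segment.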

Optionally, the step of determining, according to the information amount of the encrypted watermark image and the audio to be processed, the position information of the watermark carrier segment in which the encrypted watermark image is embedded includes:

determining the corresponding position information of the watermark points in the watermark carrier segments according to the number of the unit audio samples;

correspondingly, the step of embedding the encrypted watermark image into the watermark carrier segment according to the position information to obtain an embedded segment includes:

and determining an embedding position in the watermark carrier segment according to the position information, and embedding each watermark point into the watermark carrier segment according to the embedding position to obtain an embedded segment.

Optionally, the step of determining an embedding position in the watermark carrier segment according to the position information, and embedding each watermark point into the watermark carrier segment according to the embedding position to obtain an embedded segment includes:

dividing the watermark carrier segment according to the number of the unit audio sample points to obtain original cells corresponding to the watermark points one by one, wherein the number of the original cells is equal to the number of the watermark points, and the number of the audio sample points included in each original cell is equal to the number of the unit audio sample points;

performing domain transformation on each original cell to obtain a domain transformation cell corresponding to each original cell;

determining an embedding position in each domain transformation cell according to the position information, and respectively embedding each watermark point into the embedding position of the corresponding domain transformation cell to obtain each embedded cell;

performing the inverse of the domain transformation on each embedded cell to obtain inverse transformation cells;

and obtaining the embedded segment according to each inverse transformation cell.
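The cell-by-cell procedure above can be sketched in Python as follows. The sketch assumes the domain transformation is an orthonormal discrete cosine transform (the name DCT appears in the embedding formula, but this section does not fix the transform), that the same coefficient index is used in every cell, and that a watermark point of value 0 or 1 is embedded by scaling that coefficient by (1 + 2 × flag). The names `dct`, `idct`, and `embed_in_cells` are hypothetical.

```python
import math


def dct(x):
    """Orthonormal DCT-II of a list of floats."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = math.sqrt(1.0 / n) * c[0]
        s += sum(math.sqrt(2.0 / n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out


def embed_in_cells(carrier, flags, unit, index):
    """Embed one watermark point per cell of `unit` samples."""
    if len(carrier) != unit * len(flags):
        raise ValueError("carrier length must equal unit * number of watermark points")
    embedded_segment = []
    for t, flag in enumerate(flags):
        cell = carrier[t * unit:(t + 1) * unit]   # original cell
        coeffs = dct(cell)                        # domain transformation cell
        coeffs[index] *= (1 + 2 * flag)           # embedded cell (assumed rule)
        embedded_segment.extend(idct(coeffs))     # inverse transformation cell
    return embedded_segment


carrier = [0.1, 0.4, -0.2, 0.3, 0.5, -0.1, 0.2, 0.0]
flags = [1, 0]
marked = embed_in_cells(carrier, flags, unit=4, index=1)
```

In this example the second cell carries a 0 and is therefore returned unchanged up to floating-point error, while the chosen coefficient of the first cell is tripled.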

Optionally, the step of embedding the watermark points into the embedding positions of the corresponding domain transformation cells to obtain the embedded cells includes:

determining a first sequence of each watermark point in the watermark sequence, and determining a second sequence of each domain transformation cell according to the position of each original cell in the watermark carrier segment;

according to the first ordering and the second ordering, determining corresponding same-sequence domain transformation cells of each watermark point, wherein the first ordering of each watermark point is the same as the second ordering of the corresponding same-sequence domain transformation cells;

and embedding the watermark points into the embedding positions of the corresponding same-sequence domain transformation cells respectively to obtain the embedded cells.

Optionally, after the step of obtaining a one-dimensional watermark sequence according to the encrypted watermark image, the method further includes:

performing XOR transformation on the watermark sequence through a preset key sequence to obtain a transformed watermark sequence;

correspondingly, the step of determining an embedding position in each domain transformation cell according to the position information, and embedding each watermark point into the embedding position of the corresponding domain transformation cell respectively to obtain each embedded cell comprises:

determining a sampling point initial value of an embedding position in the domain transformation cell according to the position information;

and determining an embedded value of the embedding position according to the sampling point initial value and the value of the corresponding transformed watermark point in the transformed watermark sequence, so as to obtain an embedded cell according to the embedded value.

Optionally, the step of determining an embedded value of an embedding position according to the sampling point initial value and a value of a corresponding transformed watermark point in the transformed watermark sequence includes:

substituting the initial value of the sampling point and the value of the corresponding transformed watermark point into a preset embedding formula, and calculating to obtain an embedded value of the embedded position, wherein the preset embedding formula is as follows:

Coeff(i)(Index) = DCT(i)(Index) × (1 + 2 × Flag(i))

where DCT(i)(Index) is the sampling point initial value at the embedding position of the ith domain transformation cell;

Coeff(i)(Index) is the embedded value at the embedding position of the ith domain transformation cell;

and Flag(i) is the value of the transformed watermark point corresponding to the ith domain transformation cell.
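As a worked illustration of the preset embedding formula, the following Python sketch computes an embedded value from an initial value and a watermark point, and then recovers the watermark point from the pair. It assumes the watermark point is 0 or 1 and that the product on the right-hand side of the formula is evaluated from the sampling point initial value; the recovery rule (dividing the embedded value by the initial value) is an inference from the formula rather than a quoted step, and both function names are hypothetical.

```python
def embed_value(initial, flag):
    """Embedded value = initial value * (1 + 2 * flag), with flag in {0, 1}."""
    if flag not in (0, 1):
        raise ValueError("flag must be 0 or 1")
    return initial * (1 + 2 * flag)


def recover_flag(embedded, initial):
    """Invert the formula: flag = (embedded / initial - 1) / 2 (needs the initial value)."""
    if initial == 0:
        raise ValueError("a zero initial value cannot carry a watermark point")
    return int(round((embedded / initial - 1) / 2))


print(embed_value(0.5, 1))      # 1.5
print(embed_value(0.5, 0))      # 0.5
print(recover_flag(1.5, 0.5))   # 1
```

Under these assumptions a watermark point of 1 triples the coefficient and a point of 0 leaves it unchanged, so recovery requires the initial value to be nonzero.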

Optionally, before the step of performing xor transformation on the watermark sequence by using a preset key sequence to obtain a transformed watermark sequence, the method further includes:

determining the length of a key according to the length of the watermark sequence, and generating a corresponding preset key sequence according to the length of the key, wherein the length of the key is p times of the length of the watermark sequence, and p is a positive integer;

correspondingly, the step of performing xor transformation on the watermark sequence through the preset key sequence to obtain a transformed watermark sequence includes:

and based on the multiple relation between the key length and the length of the watermark sequence, performing spread spectrum XOR transformation on the watermark sequence through the preset key sequence to obtain a transformed watermark sequence.

Optionally, the step of performing spread spectrum xor transformation on the watermark sequence through the preset key sequence to obtain a transformed watermark sequence includes:

determining a third ordering of each watermark point in the watermark sequence and determining a fourth ordering of each key element in the preset key sequence;

according to the third ordering and the fourth ordering, performing an XOR operation p times in sequence between the t-th watermark point and the key elements at positions (t-1)p+1, (t-1)p+2, ..., tp, to obtain the p transformed watermark points at positions (t-1)p+1, (t-1)p+2, ..., tp correspondingly, where t = 1, 2, ..., L, and L is the number of watermark points in the watermark sequence;

and arranging the transformed watermark points according to the sequence of the transformed watermark points to obtain a transformed watermark sequence.
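The spread spectrum XOR transformation described above can be sketched in Python as follows. The forward transform follows the stated indexing (zero-based in the code); the despreading step, which XORs again with the key and takes a majority vote over each group of p values, is an assumption added for illustration, and the function names are hypothetical.

```python
def spread_xor(watermark, key):
    """XOR the t-th watermark bit with key elements (t-1)p+1 .. tp (1-based)."""
    length = len(watermark)
    if length == 0 or len(key) % length != 0:
        raise ValueError("key length must be a positive multiple of the watermark length")
    p = len(key) // length
    transformed = []
    for t, bit in enumerate(watermark):          # t is zero-based here
        for j in range(p):
            transformed.append(bit ^ key[t * p + j])
    return transformed


def despread_xor(transformed, key, length):
    """Recover the watermark bits (assumed: XOR with the key, then majority vote)."""
    p = len(key) // length
    recovered = []
    for t in range(length):
        chips = [transformed[t * p + j] ^ key[t * p + j] for j in range(p)]
        recovered.append(1 if 2 * sum(chips) > p else 0)
    return recovered


watermark = [1, 0, 1]
key = [1, 0, 0, 1, 1, 1]                         # p = 2
transformed = spread_xor(watermark, key)
print(transformed)                               # [0, 1, 0, 1, 0, 0]
print(despread_xor(transformed, key, 3))         # [1, 0, 1]
```

Each watermark point is spread over p transformed points, so the transformed sequence has p × L elements.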

Optionally, after the step of embedding the encrypted watermark into the audio segment corresponding to the audio to be processed to obtain the watermarked audio, the method further includes:

segmenting the watermark audio according to the number of the required audio samples to obtain the embedded segment and the non-carrier segment;

dividing the embedded segment according to the number of the unit audio samples to obtain the inverse transformation cells;

performing the domain transformation on the inverse transformation cells to obtain the embedded cells;

determining the embedded value of each embedded cell according to the position information;

calculating the embedded value and the corresponding initial value of the sampling point to obtain the value of the transformed watermark point;

combining the values of the transformed watermark points to obtain the transformed watermark sequence, and performing XOR transformation on the transformed watermark sequence through the preset key sequence to obtain the watermark sequence before transformation;

obtaining the encrypted watermark image according to the watermark sequence;

and decrypting the encrypted watermark image to obtain the original watermark image.

Furthermore, in order to achieve the above object, an embodiment of the present invention further provides an audio processing apparatus, which includes a processor, a memory, and a computer program stored on the memory and executed by the processor, wherein when the computer program is executed by the processor, the steps of the audio processing method as described above are implemented.

Furthermore, to achieve the above object, an embodiment of the present invention further provides a storage medium, which stores a computer program, wherein the computer program, when executed by a processor, implements the steps of the audio processing method as described above.

The embodiment of the invention firstly obtains the watermark image used for representing the copyright of the audio work, then encrypts the watermark image, and then embeds the encrypted watermark image into the audio to be processed to obtain the watermark audio; adding a watermark image into the audio, namely adding copyright information into the audio, so as to represent the copyright of the audio; in addition, because the watermark is embedded in an encrypted form, the difficulty of tampering is improved, and the integrity and the safety of the watermark (copyright information) are further ensured.

Drawings

FIG. 1 is a schematic diagram of a hardware architecture of an audio processing device according to an embodiment of the present invention;

FIG. 2 is a flowchart illustrating a first embodiment of an audio processing method according to the present invention;

FIG. 3 is a schematic diagram illustrating the encryption effect according to a first embodiment of the audio processing method of the present invention;

FIG. 4 is a waveform diagram of a sound signal before and after watermark embedding according to a fifth embodiment of the audio processing method of the present invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

It should be understood that the specific embodiments described herein are for purposes of illustration and are not intended to limit the invention.

The audio processing method related by the embodiment of the invention is mainly applied to audio processing equipment, and the audio processing equipment can be a server, a Personal Computer (PC), a notebook computer, a mobile phone and the like.

Referring to FIG. 1, FIG. 1 is a schematic diagram of a hardware architecture of an audio processing device according to an embodiment of the present invention. In this embodiment of the present invention, the audio processing device includes a processor 1001 (e.g., a Central Processing Unit (CPU)), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used for connection and communication among the components; the user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); the network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface); the memory 1005 may be a Random Access Memory (RAM) or a non-volatile memory, such as a disk memory, and the memory 1005 may optionally be a storage device independent of the processor 1001. Of course, those skilled in the art will appreciate that the hardware configuration shown in FIG. 1 is not intended to limit the present invention.

With continued reference to FIG. 1, the memory 1005 of FIG. 1, which is one type of readable storage medium, may include an operating system, a network communication module, and a computer program. In fig. 1, the network communication module may be used to connect to the database for data interaction with the database; and the processor 1001 may call up a computer program stored in the memory 1005 and implement the audio processing method of the embodiment of the present invention.

The embodiment of the invention provides an audio processing method.

Referring to fig. 2, fig. 2 is a flowchart illustrating an audio processing method according to a first embodiment of the invention.

In this embodiment, the audio processing method includes the following steps:

Step S10, acquiring the original watermark image, and encrypting the original watermark image to obtain an encrypted watermark image.

With the development and widespread use of multimedia technology and digital communication, various conventional multimedia works including audio works have begun to be spread out through the network as a medium to increase the speed of transmission and expand the range of transmission. However, the audio works based on network transmission are very easy to be illegally used, which results in that many original authors are unwilling to disclose their own audio works, thereby seriously limiting the transmission of the audio works; therefore, how to protect the copyright of the audio works is a problem to be solved urgently at present. To address this problem, the present embodiment provides an audio processing method in which a watermark image used to represent the copyright of an audio work is obtained and encrypted, an embedding position is then determined according to the information amount of the watermark image and of the audio to be processed, and the encrypted watermark image is then embedded at that position in the audio to be processed, so as to obtain a watermark audio. Adding a watermark image into the audio is equivalent to adding copyright information into the audio, so that the copyright of the audio is represented; in addition, because the watermark is embedded in an encrypted form, the difficulty of tampering is increased, and the integrity and the security of the watermark (copyright information) are further ensured.

The audio processing method of this embodiment may be applied to an audio processing device, where the audio processing device is an independent physical device, such as a server, a Personal Computer (PC), a notebook computer, a mobile phone, and the like, and the audio processing device performs the related processing on the audio to be processed and the watermark image. The audio processing method of this embodiment may also be applied to a virtual device, that is, an abstract functional device composed of a plurality of different physical functional modules. For convenience of explanation, the present embodiment is described by taking as an example the case in which the method is applied to a PC and the processing is performed by the PC.

In this embodiment, after the PC starts running, it may first read the original watermark file to obtain the original watermark image; the original watermark file may be stored in a local storage space of the PC, or may be stored on a network, that is, the PC may obtain the original watermark image either from the local storage space or from the network. After the original watermark image is obtained, the PC encrypts it, which increases the difficulty of tampering and further ensures the integrity and the security of the watermark (copyright information).

Further, the step S10 includes:

step A11, acquiring an original watermark image, and performing signal decomposition on the original watermark image to acquire low-frequency and high-frequency information corresponding to the original watermark image;

When the original watermark image is encrypted, this embodiment may encrypt certain characteristic parameters of the original watermark image. Specifically, once the original watermark image is obtained, signal decomposition may be performed on it, and its low-frequency and high-frequency information is obtained from the decomposition result. Signal decomposition here means representing the original watermark image as a superposition of one or more basis functions (or wave functions). For an image, the low-frequency and high-frequency information reflects the characteristics of the image: the low-frequency information is mainly a comprehensive measure of the intensity of the whole image and describes the areas where the brightness or gray value changes slowly, namely the large flat areas of the image; the high-frequency information is mainly a measure of image edges and contours and describes the parts of the image that change sharply, namely edges (contours), noise, and fine detail. Once the low-frequency and high-frequency information of the original watermark image is obtained, it can be encrypted to obtain encrypted information.

It is worth noting that signal decomposition requires choosing the basis used for the decomposition, that is, what the signal is measured against. For example, the basis used by the Fourier transform is the sine wave (sine function), and the basis used by the two-dimensional discrete wavelet transform dwt2 is a specific wavelet. When different bases are used for signal decomposition, the resulting low-frequency and high-frequency information may differ; in other words, observing the image in different ways may reveal different characteristics.

In addition, when encrypting the low-frequency and high-frequency information, all of it may be encrypted, or only a part of it may be encrypted (for example, only the low-frequency information), so that encryption is achieved while encryption efficiency is improved.

In this embodiment, the original watermark image is decomposed by means of the two-dimensional discrete wavelet transform dwt2, that is, a specific wavelet is used as the basis for decomposing the original watermark image, so as to obtain the corresponding low-frequency and high-frequency information. This information includes an approximate component ca1, a horizontal detail component ch1, a vertical detail component cv1, and a diagonal detail component cd1. The approximate component ca1 may be regarded as the low-frequency information obtained when decomposing on the basis of the specific wavelet, and the horizontal detail component ch1, the vertical detail component cv1, and the diagonal detail component cd1 are the high-frequency information obtained from the same decomposition. Because this embodiment mainly encrypts the approximate component ca1 (i.e., the low-frequency information), the components ch1, cv1, and cd1 may be collectively referred to as detail components. For example, when step A11 is implemented in MATLAB, it may be:

Original_WaterImg = imread('logo.jpg');

[ca1,ch1,cv1,cd1] = dwt2(Original_WaterImg,'bior3.7');

In the above implementation, logo.jpg is the original watermark image file and bior3.7 is the wavelet. The example specifies the bior3.7 wavelet for the wavelet transform, but other types of wavelets may be specified in a particular application.
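To make the four components concrete, the following Python sketch performs a single-level two-dimensional decomposition with the Haar wavelet, which is swapped in here for bior3.7 because it can be written in a few lines without a wavelet library. The function names are hypothetical, the image is a small list of lists with even dimensions, and the assignment of the names ch, cv, and cd follows a common convention that is assumed rather than checked against MATLAB's dwt2 output.

```python
def haar_dwt2(img):
    """Single-level 2-D Haar decomposition into approximation and detail components."""
    rows, cols = len(img), len(img[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even")
    ca, ch, cv, cd = [], [], [], []
    for r in range(0, rows, 2):
        ra, rh, rv, rd = [], [], [], []
        for c in range(0, cols, 2):
            a, b = img[r][c], img[r][c + 1]
            p, q = img[r + 1][c], img[r + 1][c + 1]
            ra.append((a + b + p + q) / 2)   # approximation (low frequency)
            rh.append((a + b - p - q) / 2)   # horizontal detail
            rv.append((a - b + p - q) / 2)   # vertical detail
            rd.append((a - b - p + q) / 2)   # diagonal detail
        ca.append(ra); ch.append(rh); cv.append(rv); cd.append(rd)
    return ca, ch, cv, cd


def haar_idwt2(ca, ch, cv, cd):
    """Inverse of haar_dwt2: rebuild the image from the four components."""
    img = []
    for ra, rh, rv, rd in zip(ca, ch, cv, cd):
        top, bottom = [], []
        for a, h, v, d in zip(ra, rh, rv, rd):
            top += [(a + h + v + d) / 2, (a + h - v - d) / 2]
            bottom += [(a - h + v - d) / 2, (a - h - v + d) / 2]
        img += [top, bottom]
    return img


image = [[8, 8, 0, 4],
         [8, 8, 4, 0],
         [1, 3, 5, 5],
         [1, 3, 9, 9]]
ca1, ch1, cv1, cd1 = haar_dwt2(image)
print(ca1)   # [[16.0, 4.0], [4.0, 14.0]]
restored = haar_idwt2(ca1, ch1, cv1, cd1)
```

The flat top-left block produces only an approximation value, while the checkerboard-like top-right block shows up in the diagonal detail component; the inverse transform restores the original image.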

Step A12, encrypting the low-frequency and high-frequency information to obtain encrypted information;

When the low-frequency and high-frequency information is obtained, it can be encrypted to obtain encrypted information. In this embodiment, mainly the low-frequency information is encrypted: once the approximate component ca1 is obtained, the PC may perform chaotic encryption on ca1 to obtain an encrypted component fca1. The chaotic encryption can be implemented with an existing function, such as the hundungen function called from MATLAB. The principle of the chaotic encryption is as follows: determine the number of rows and columns (M rows × N columns) of the approximate component ca1; generate an intermediate parameter e from the numbers of rows and columns based on the preset hundungen algorithm and a preset parameter; then perform a remainder operation on ca1, the intermediate parameter e, and the preset parameters to obtain the encrypted component fca1. For example, when step A12 is implemented in MATLAB, it may be:

[M,N] = size(ca1);

e = hundungen(M,N,0.1);

coefficient = 0.1;

fca1 = mod(coefficient*ca1+(1-coefficient)*e,256);

In the above implementation, e is the intermediate parameter obtained by the preset hundungen algorithm, hundungen is the chaotic function called from MATLAB (it is not a built-in MATLAB function, so it must be supplied separately), 0.1 and 256 are user-defined parameters, and fca1 is the encrypted component. Once the encrypted component is obtained, it may be combined with the unprocessed detail components (the horizontal detail component ch1, the vertical detail component cv1, and the diagonal detail component cd1) to obtain the complete encrypted information (fca1, ch1, cv1, cd1).

Step A13, performing signal reconstruction according to the encrypted information to obtain a corresponding encrypted watermark image.

When the encrypted information is obtained, signal reconstruction can be carried out according to the encrypted information to obtain a corresponding encrypted watermark image. It should be noted that the reconstruction method used for signal reconstruction and the decomposition method used for signal decomposition are inverse transformations of each other; that is, signal reconstruction rebuilds the image by superimposing the weighted basis functions. Specifically, in this embodiment, once the complete encrypted information is obtained, the two-dimensional inverse discrete wavelet transform idwt2 is applied to it to obtain the corresponding encrypted watermark image; the inverse transform must use the same wavelet as step A11. For example, when step A13 is implemented in MATLAB, it may be:

jiami_Img = idwt2(fca1,ch1,cv1,cd1,'bior3.7');

In the above implementation, jiami_Img is the encrypted watermark image.

As shown in FIG. 3, FIG. 3 is a schematic diagram of the encryption effect in this embodiment. This encryption mode is simple to use and efficient, different encryption results can be obtained merely by changing the encryption parameters, and the confidentiality is good. Of course, the encryption may be varied in practice, for example by changing the type of wavelet or the parameters of the remainder operation.

Further, the encrypted watermark image described in this embodiment may be decrypted by the inverse operations to recover the original watermark image. Specifically, the encrypted watermark image is obtained first, and the two-dimensional discrete wavelet transform is applied to it to obtain the corresponding low-frequency and high-frequency information; this is the encrypted information from the encryption process and includes the encrypted component. The encrypted component is then subjected to chaotic inverse mapping (namely chaotic decryption) with the same parameters to obtain the approximate component. The approximate component and the other components form the low-frequency and high-frequency coefficients of the original watermark image, and applying the two-dimensional inverse discrete wavelet transform to these coefficients yields the original watermark image. For example, when implemented in MATLAB, it may be:

jiami_Img = imread('jiami_Img.jpg');

[fca1,ch1,cv1,cd1] = dwt2(jiami_Img,'bior3.7');

[M,N] = size(fca1);

e = hundungen(M,N,0.1);

coefficient = 0.1;

ca1 = (fca1-(1-coefficient)*e)/coefficient;

Original_WaterImg = idwt2(ca1,ch1,cv1,cd1,'bior3.7');

In the above implementation, Original_WaterImg is the original watermark image obtained by decryption; the wavelet type and the function parameters used in the decryption process should be consistent with those used in the encryption process.
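The encryption and decryption arithmetic on the approximate component can be sketched as a round trip in Python. The hundungen function is not defined in this document, so the sketch substitutes a logistic-map generator as an assumed stand-in that produces values in (0, 1) from an initial value x0; the names `logistic_matrix`, `encrypt`, and `decrypt` are hypothetical. The inversion shown is exact only when the weighted sum stays below the modulus, so that the remainder operation does not wrap.

```python
def logistic_matrix(rows, cols, x0, r=3.99):
    """Assumed stand-in for hundungen: a rows x cols matrix from the logistic map."""
    x = x0
    out = []
    for _ in range(rows):
        row = []
        for _ in range(cols):
            x = r * x * (1 - x)          # logistic map iteration, stays in (0, 1)
            row.append(x)
        out.append(row)
    return out


def encrypt(ca, x0, coefficient=0.1, modulus=256):
    """fca = mod(coefficient * ca + (1 - coefficient) * e, modulus)."""
    e = logistic_matrix(len(ca), len(ca[0]), x0)
    return [[(coefficient * v + (1 - coefficient) * k) % modulus
             for v, k in zip(row_v, row_k)]
            for row_v, row_k in zip(ca, e)]


def decrypt(fca, x0, coefficient=0.1):
    """ca = (fca - (1 - coefficient) * e) / coefficient, valid if no wrap occurred."""
    e = logistic_matrix(len(fca), len(fca[0]), x0)
    return [[(v - (1 - coefficient) * k) / coefficient
             for v, k in zip(row_v, row_k)]
            for row_v, row_k in zip(fca, e)]


ca1 = [[16.0, 4.0], [4.0, 14.0]]
fca1 = encrypt(ca1, x0=0.1)
recovered = decrypt(fca1, x0=0.1)        # same parameters: matches ca1
wrong = decrypt(fca1, x0=0.2)            # different initial value: does not match
```

Decrypting with the same initial value and coefficient recovers the approximate component up to floating-point error, whereas a different initial value yields a visibly different result, which illustrates why the parameters must match those used for encryption.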

Step S20, acquiring the audio to be processed, and determining the position information of the encrypted watermark image according to the information quantity of the encrypted watermark image and the audio to be processed;

In this embodiment, the PC also reads the audio file to be processed, thereby acquiring the audio to be processed; similarly, the audio file to be processed may be stored in a local storage space of the PC or on a network, that is, the PC may obtain the audio to be processed either from the local storage space or from the network. After the audio to be processed is obtained, the PC determines the position information for embedding the encrypted watermark image, that is, where in the audio to be processed the encrypted watermark image is to be embedded. Because the position information is determined from the information amount of both the encrypted watermark image and the audio to be processed, it is likely to change whenever either amount changes. This avoids embedding every watermark at the same fixed place, which would make all watermarks easy to break, and thus improves the security of watermark embedding.

And step S30, embedding the encrypted watermark image into the audio to be processed according to the position information to obtain a watermark audio.

In this embodiment, after determining the embedded position information, the PC may embed the encrypted watermark image into the audio to be processed according to the position information, which is equivalent to adding copyright information to the audio to represent the copyright of the audio.

The embodiment acquires an original watermark image, and encrypts the original watermark image to obtain an encrypted watermark image; acquiring audio to be processed, and determining the embedded position information of the encrypted watermark image according to the information quantity of the encrypted watermark image and the audio to be processed; and embedding the encrypted watermark image into the audio to be processed according to the position information to obtain the watermark audio. Through the above manner, in this embodiment, the watermark image used for representing the copyright of the audio work is obtained first, then encrypted, and then the encrypted watermark image is embedded into the audio to be processed to obtain the watermark audio; adding a watermark image into the audio, namely adding copyright information into the audio, so as to represent the copyright of the audio; in addition, because the watermark is embedded in an encrypted form, the difficulty of tampering is improved, and the integrity and the safety of the watermark (copyright information) are further ensured.

Based on the above first embodiment of the audio processing method, a second embodiment of the audio processing method of the present invention is provided.

In this embodiment, the step S20 includes:

step A21, acquiring a to-be-processed audio, and segmenting the to-be-processed audio according to the information content of the encrypted watermark image and the to-be-processed audio to obtain a watermark carrier segment and a non-carrier segment;

In this embodiment, after acquiring the audio to be processed, the PC segments it according to the information amounts of the encrypted watermark image and of the audio to be processed, so as to separate the audio segment into which the encrypted watermark image is to be embedded from the rest of the audio. For convenience of expression, the audio segment into which the encrypted watermark image is embedded is referred to as the watermark carrier segment, and the remaining audio is referred to as the non-carrier segment. It should be noted that the segmentation in this embodiment divides the whole audio to be processed into a watermark carrier segment in the first half and a non-carrier segment in the second half; in practice, the whole audio may also be used as the watermark carrier segment.

Step A22, determining the position information of the watermark carrier segment embedded by the encrypted watermark image according to the information quantity of the encrypted watermark image and the audio to be processed;

after obtaining the watermark carrier segment, the PC determines the position information of the embedded encrypted watermark image, i.e. the position of the watermark carrier segment where the encrypted watermark image is to be embedded. In the embodiment, the embedded position information of the encrypted watermark image is determined according to the information quantity of the encrypted watermark image and the audio to be processed, so that the situation that all watermarks are easy to break due to the fact that all watermarks are embedded in the same place is avoided, and the watermark embedding safety is improved.

Accordingly, the step S30 includes:

step A31, the encrypted watermark image is embedded into the watermark carrier segment according to the position information to obtain an embedded segment, and the embedded segment and the non-carrier segment are combined to obtain the watermark audio.

In this embodiment, after the segmentation and the position information are obtained, the PC processes only the corresponding audio segment (the watermark carrier segment) of the audio to be processed according to the position information, that is, the encrypted watermark image is embedded at a certain position of the watermark carrier segment according to the position information. For convenience of description, the watermark carrier segment after embedding is referred to as the "embedded segment"; the non-carrier segment remains unchanged during embedding. The embedded segment is then combined with the non-carrier segment to obtain the watermark audio. Since the segmentation in this embodiment divides the audio to be processed into a watermark carrier segment in the first half and a non-carrier segment in the second half, the embedded segment is spliced as the front part and the non-carrier segment as the rear part when merging.

By the above mode, the audio to be processed is segmented, and then the encrypted watermark image is embedded in a part of audio segments, so that the data processing amount is reduced, the change of the audio to be processed can be reduced, and the original characteristics of the audio to be processed are kept as much as possible.

Based on the above second embodiment of the audio processing method, a third embodiment of the audio processing method of the present invention is proposed.

In this embodiment, the step A21 includes:

step A211, acquiring audio to be processed, and determining the number of audio samples of the audio to be processed;

In this embodiment, the information amount of the audio to be processed may be represented by its number of audio samples. Digital audio stored and propagated on a computer or a network describes an analog signal by a digital signal: a point is taken on the original analog waveform at fixed time intervals and each point is assigned a numerical value. This process is sampling, and the points are called audio samples. All the audio samples connected in sequence describe the analog signal and form the audio to be processed, and their count is the number of audio samples of the audio to be processed. After the PC obtains the audio to be processed, it can determine the number of audio samples it contains. For example, when implemented by matlab, it may be:

[Wav,Fs] = audioread('originalWav.wav');

[WavLen,channel] = size(Wav);

In the above implementation, Wav is the audio to be processed, Fs is the sampling rate, WavLen is the number of audio samples, and channel is the number of channels.

Step A212, obtaining a one-dimensional watermark sequence according to the encrypted watermark image, and determining the number of watermark points in the watermark sequence;

In this embodiment, the information amount of the encrypted watermark image can be represented by the number of pixels of the image. Because the pixels of an image are usually arranged as a multi-dimensional matrix whereas audio samples are arranged in a series, the PC performs binarization and dimension reduction on the encrypted watermark image in order to count its pixels and to prepare for embedding; the multi-dimensional encrypted watermark image is thereby converted into a one-dimensional watermark sequence, on which the subsequent processing is performed. For example, when implemented by matlab, it may be:

waterImg = imread('originallogo.jpg');

BW = im2bw(waterImg);

[row,column] = size(BW);

Length = row*column;

OneDimensional = reshape(BW,1,Length);

In the above implementation, originallogo.jpg is the watermark file, and OneDimensional is the one-dimensional watermark sequence.

When a one-dimensional watermark sequence is obtained, the PC can determine the number of watermark points in the watermark sequence.
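The binarization and dimension reduction can be sketched as follows, in Python for illustration only. The sketch assumes a grayscale image given as a nested list with values in [0, 1] and a threshold of 0.5; the function name is hypothetical. The flattening is column by column, which is the order in which matlab's reshape reads a matrix.

```python
def to_watermark_sequence(image, threshold=0.5):
    """Binarize a 2-D grayscale image and flatten it into a 1-D bit sequence."""
    rows, cols = len(image), len(image[0])
    # Binarization: pixels brighter than the threshold become 1, others 0.
    binary = [[1 if pixel > threshold else 0 for pixel in row] for row in image]
    # Dimension reduction: read the matrix column by column (column-major order).
    return [binary[r][c] for c in range(cols) for r in range(rows)]


image = [[0.9, 0.1, 0.7],
         [0.2, 0.8, 0.3]]
sequence = to_watermark_sequence(image)
watermark_points = len(sequence)  # number of watermark points (row * column)
```

The number of watermark points is simply the length of the resulting sequence, i.e. the product of the row and column counts.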

Step A213, determining the number of unit audio samples corresponding to each watermark point according to the number of the audio samples and the number of the watermark points;

In this embodiment, after the PC obtains the number of audio samples of the audio to be processed and the number of watermark points in the watermark sequence, the number of audio samples is divided by the number of watermark points to obtain the number of unit audio samples corresponding to each watermark point. It should be noted that the number of unit audio samples is the number of audio samples corresponding to one watermark point; it is an average, so it is the same for every watermark point, and it characterizes the embedding space of a watermark point. For example, if it is 3, one watermark point corresponds to 3 audio samples, and during embedding 3 consecutive audio samples serve as the embedding space of one watermark point. The watermark point need not be embedded into all three audio samples; it may be embedded into only one or some of the audio samples in the embedding space (in this embodiment, into one audio sample, whose position is determined by the "position information" in step A22). In addition, since the number of audio samples is often not divisible by the number of watermark points, the number of unit audio samples is determined by rounding down. For example, when implemented by matlab, it may be:

N = floor(WavLen/n);

In the above implementation, N is the number of unit audio samples, WavLen is the number of audio samples, and n is the number of watermark points.

Step A214, determining the number of required audio samples corresponding to the watermark sequence according to the number of the unit audio samples and the number of the watermark points;

In this embodiment, once the number of unit audio samples is determined, it is multiplied by the number of watermark points to obtain the number of required audio samples corresponding to the whole watermark sequence. In other words, the number of unit audio samples is the number of audio samples corresponding to a single watermark point (representing the embedding space of a single watermark point), and multiplying it by the number of watermark points gives the number of audio samples required by the whole watermark sequence (representing the embedding space of the whole watermark sequence). For example, if the number of unit audio samples is 3 and the number of watermark points is 3, the number of required audio samples is 9, that is, 9 consecutive audio samples are needed as the embedding space of the whole watermark sequence (although not all 9 audio samples are necessarily modified). For example, when implemented by matlab, it may be:

All_car_length = n*N;

In the above implementation, All_car_length is the number of required audio samples.

Step A215, segmenting the audio to be processed according to the number of the required audio samples to obtain corresponding watermark carrier segments and non-carrier segments;

In this embodiment, once the number of required audio samples is determined, the audio to be processed is segmented accordingly to obtain the watermark carrier segment, whose number of samples equals the number of required audio samples; the rest of the audio to be processed is the non-carrier segment. For example, if the whole audio to be processed contains 11 audio samples and the number of required audio samples is 9, the first 9 audio samples form the watermark carrier segment and the last 2 audio samples form the non-carrier segment. The watermark carrier segment can be regarded as the embedding space of the whole watermark sequence; the subsequent embedding is performed on the watermark carrier segment, while the non-carrier segment is kept unchanged. For example, when the segmentation is implemented by matlab, it may be:

Carrier = Wav(1:All_car_length,1); % watermark carrier segment (first channel)

Other = Wav(All_car_length+1:end,1); % non-carrier segment

in the above implementation, Carrier is the watermark Carrier segment, and Other is the non-Carrier segment.
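Steps A213 to A215 can be summarized in a short sketch, written in Python for illustration only. The function name split_audio is hypothetical, and the audio is assumed to be a single-channel list of samples.

```python
def split_audio(samples, watermark_points):
    """Return (unit sample count, watermark carrier segment, non-carrier segment)."""
    unit = len(samples) // watermark_points  # N = floor(WavLen / n)
    if unit == 0:
        # Audio too short or watermark too long: nothing can be embedded.
        raise ValueError("audio has fewer samples than watermark points")
    required = unit * watermark_points       # All_car_length = n * N
    return unit, samples[:required], samples[required:]


samples = list(range(1, 12))  # 11 audio samples, numbered 1..11
unit, carrier, other = split_audio(samples, 3)
```

With 11 audio samples and 3 watermark points, each watermark point gets 3 samples, the first 9 samples form the watermark carrier segment, and the last 2 samples form the non-carrier segment, matching the example given above.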

Correspondingly, after the watermark carrier segment is obtained, the watermark sequence can be embedded into the watermark carrier segment, and then the embedded segment and the non-carrier segment are combined to obtain the watermark audio. It should be noted that the segmentation in this embodiment is to divide a whole segment of audio to be processed into a watermark carrier segment in the first half and a non-carrier segment in the second half, so that when merging, the embedded segment is used as the front part, and the non-carrier segment is used as the rear part to be spliced, so as to obtain the watermark audio.

Through the mode, the audio to be processed is firstly segmented according to the number of the audio samples and the watermark information amount, and then the encrypted watermark image is embedded in a part of the audio segments, so that the data processing amount is favorably reduced, the change of the audio to be processed can be reduced, and the original characteristics of the audio to be processed are kept as much as possible.

Based on the above third embodiment of the audio processing method, a fourth embodiment of the audio processing method of the present invention is proposed.

In this embodiment, the step A22 includes:

step A221, determining the corresponding position information of the watermark point in the watermark carrier segment according to the unit audio sample number;

In this embodiment, after obtaining the watermark carrier segment, the PC further determines the position information of the watermark points in the watermark carrier segment according to the information amounts of the encrypted watermark image and of the audio to be processed. The position information characterizes the position at which a watermark point is to be embedded, that is, which audio sample of the watermark carrier segment the watermark point is embedded in. Since the number of unit audio samples has already been obtained from these information amounts in steps A211 to A213, the position information can be determined from the number of unit audio samples. Specifically, denote the number of unit audio samples by N. If N is 1, the position information is 1, namely embedding starts from the first audio sample of the watermark carrier segment. If N is greater than 1, half of N is used as the position information, that is, N/2; since N may not be divisible by 2, the result is rounded up. If N is zero, the audio may be too short or the watermark information too long to embed, and a prompt may be issued. For example, when performed by matlab, it may be:

In the above implementation, Index is the position information.
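The rule described in step A221 can be sketched as follows, in Python for illustration only; the function name is hypothetical, and raising an error stands in for the prompt mentioned above.

```python
import math


def position_index(unit_samples):
    """1-based position information derived from the unit audio sample count N."""
    if unit_samples <= 0:
        # N == 0: audio too short or watermark too long (stand-in for the prompt).
        raise ValueError("cannot embed: no audio samples available per watermark point")
    # N == 1 gives 1; N > 1 gives N/2 rounded up.
    return math.ceil(unit_samples / 2)
```

Because rounding 1/2 up already yields 1, the case N = 1 does not need separate handling in this sketch.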

Correspondingly, the step A31 includes:

step A311, determining an embedding position in the watermark carrier segment according to the position information, and embedding each watermark point into the watermark carrier segment according to the embedding position to obtain an embedded segment.

In this embodiment, once the position information is determined, an embedding position can be determined in the watermark carrier segment according to the position information, and each watermark point is then embedded into the watermark carrier segment at its embedding position to obtain the embedded segment. For example, if the number of unit audio samples is 3 and the number of watermark points is 3, the number of required audio samples is 9, that is, the watermark carrier segment contains 9 consecutive audio samples; according to step A221, the position information is 3/2 rounded up, i.e. 2, so the 2nd, 5th and 8th audio samples in the watermark carrier segment are determined to be the embedded audio samples. The specific determination may be implemented in various ways. For example, the 2nd audio sample in the watermark carrier segment is determined as the 1st embedded audio sample according to the position information 2; counting then starts from the next audio sample (the 3rd), and the audio sample reached when the count is 3 (the number of unit audio samples), i.e. the 5th, is the 2nd embedded audio sample; counting starts again from the next audio sample (the 6th), and the audio sample reached when the count is 3, i.e. the 8th, is the 3rd embedded audio sample. As another example, the whole watermark carrier segment may be equally divided into 3 original cells (or sub-segments) according to the number of unit audio samples 3, each containing 3 audio samples and each used to embed one watermark point: the 1st to 3rd audio samples form the 1st original cell (for the 1st watermark point), the 4th to 6th audio samples form the 2nd original cell (for the 2nd watermark point), and the 7th to 9th audio samples form the 3rd original cell (for the 3rd watermark point). The 2nd audio sample in each original cell is then determined to be the embedded audio sample according to the position information 2, where the 2nd audio sample of the 1st original cell is the 2nd audio sample of the whole watermark carrier segment, that of the 2nd original cell is the 5th, and that of the 3rd original cell is the 8th. The determination may also be made in ways other than the above examples.
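Both ways of determining the embedded audio samples lead to the same positions, which can be computed directly, as in the following Python sketch (for illustration only; the function name is hypothetical and positions are 1-based, as in the text).

```python
def embedding_positions(unit_samples, watermark_points, index):
    """1-based positions, within the carrier segment, of the embedded audio samples."""
    # Cell number `cell` (0-based) starts after cell * unit_samples samples;
    # the embedded sample is the index-th sample inside that cell.
    return [cell * unit_samples + index for cell in range(watermark_points)]


positions = embedding_positions(unit_samples=3, watermark_points=3, index=2)
```

For the example above (3 unit audio samples, 3 watermark points, position information 2), this gives the 2nd, 5th and 8th audio samples.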

In the above manner, the position index (the position information) is determined according to the number of unit audio samples, and the watermark points are then embedded according to this position index, so that the embedding of the watermark points follows a fixed rule, which facilitates subsequent extraction of the watermark points; in addition, this helps to avoid the large negative impact that disordered embedding would have on the original audio.

Based on the fourth embodiment of the audio processing method described above, a fifth embodiment of the audio processing method of the present invention is proposed.

In this embodiment, the step A311 includes:

Step A3111, segmenting the watermark carrier segment according to the number of unit audio samples to obtain original cells in one-to-one correspondence with the watermark points, wherein the number of original cells is equal to the number of watermark points, and the number of audio samples included in each original cell is equal to the number of unit audio samples;

In this embodiment, when the PC performs watermark embedding, it first segments the watermark carrier segment according to the number of unit audio samples to obtain a plurality of original cells C. The original cells C are in one-to-one correspondence with the watermark points, the number of original cells C is equal to the number of watermark points, and the number of audio samples included in each original cell C is equal to the number of unit audio samples; each original cell C can be regarded as the storage location of one watermark point. For example, if the number of unit audio samples is 3 and the number of watermark points is 3, the number of required audio samples is 9, that is, the watermark carrier segment contains 9 consecutive audio samples; according to step A221, the position information is 3/2 rounded up, i.e. 2. When the watermark carrier segment is segmented, it is equally divided into 3 original cells (or sub-segments) according to the number of unit audio samples 3, each containing 3 audio samples and each used to embed one watermark point: the 1st to 3rd audio samples form the 1st original cell (for the 1st watermark point), the 4th to 6th audio samples form the 2nd original cell (for the 2nd watermark point), and the 7th to 9th audio samples form the 3rd original cell (for the 3rd watermark point). When implemented by matlab, it can be:

In the above implementation, C is the original cell and N is the number of unit audio samples; the number of audio samples included in each original cell is equal to the number of unit audio samples.
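The segmentation of the watermark carrier segment into original cells can be sketched as follows, in Python for illustration only; the function name is hypothetical and the carrier segment is assumed to be a flat list of samples.

```python
def split_into_cells(carrier, unit_samples):
    """Split the carrier segment into consecutive cells of unit_samples samples each."""
    if unit_samples <= 0 or len(carrier) % unit_samples != 0:
        raise ValueError("carrier length must be a positive multiple of unit_samples")
    return [carrier[start:start + unit_samples]
            for start in range(0, len(carrier), unit_samples)]


cells = split_into_cells(list(range(1, 10)), 3)  # 9 samples, 3 per cell
```

The number of cells equals the number of watermark points because the carrier segment length was chosen as the product of the two counts.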

Step A3112, performing domain transformation on each original cellular to obtain domain-transformed cellular corresponding to each original cellular;

after the original cells are obtained, the PC respectively carries out domain transformation on each original cell to obtain each domain transformation cell corresponding to each original cell, so that the audio sampling points are converted from a space domain to a transformation domain and then are embedded to weaken or remove the negative influence of data correlation; in the transformation process, one original cell corresponds to one domain transformation cell. The domain transform in this embodiment uses a two-dimensional discrete cosine transform DCT, so that the audio samples are transformed from a spatial domain to a DCT transform domain and then embedded, so as to reduce or remove the negative effects of data correlation and enhance the robustness of the audio carrier. The principle of the two-dimensional discrete cosine transform DCT is as follows:

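In its standard orthonormal form, for an M×N input f(m,n), the transform is

$$F(k,l) = a(k)\,b(l) \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} f(m,n) \cos\frac{(2m+1)k\pi}{2M} \cos\frac{(2n+1)l\pi}{2N}$$

where m, k = 0, 1, …, M-1; n, l = 0, 1, …, N-1, and

$$a(k) = \begin{cases} \sqrt{1/M}, & k = 0 \\ \sqrt{2/M}, & k \ge 1 \end{cases} \qquad b(l) = \begin{cases} \sqrt{1/N}, & l = 0 \\ \sqrt{2/N}, & l \ge 1 \end{cases}$$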

The two-dimensional discrete cosine transform DCT of the original cells can be realized by means of related library functions. For example, when implemented by matlab, it may be:

In the above implementation, DCT is the domain transformation cell and C is the original cell.
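As an illustrative sketch in Python (not the library call used in the embodiment), the transform of a single cell can be written out directly. It is assumed here that each original cell is a one-dimensional vector of samples, in which case the two-dimensional DCT reduces to the one-dimensional orthonormal DCT; the function name dct is hypothetical.

```python
import math


def dct(cell):
    """Orthonormal DCT of a 1-D cell of samples."""
    n = len(cell)
    coefficients = []
    for k in range(n):
        # Normalization factor: sqrt(1/n) for k == 0, sqrt(2/n) otherwise.
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(x * math.cos(math.pi * (2 * m + 1) * k / (2 * n))
                    for m, x in enumerate(cell))
        coefficients.append(scale * total)
    return coefficients


coefficients = dct([1.0, 1.0, 1.0])  # a constant cell has only a DC component
```

A constant cell maps to a single non-zero coefficient at the first position, which illustrates how the transform concentrates correlated samples into few coefficients.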

Step A3113, determining embedding positions in the domain transformation cells according to the position information, and embedding the watermark points into the embedding positions of the corresponding domain transformation cells to obtain the embedded cells;

in this embodiment, after the domain transformed cells are obtained, the embedding positions may be determined in the domain transformed cells according to the position information, and then the watermark points are respectively embedded into the embedding positions of the corresponding DCT cells to obtain the embedded cells.

Specifically, when the transformation embedding is performed, a first ordering of the watermark points in the watermark sequence is determined, and a second ordering of the domain transformation cells is determined according to the position of each original cell in the watermark carrier segment. For example, for a watermark sequence (a, b, c), the first ordering of the three watermark points a, b, c is 1, 2, 3. The watermark carrier segment contains 9 audio samples, of which the 1st to 3rd form the 1st original cell X (the second ordering of X is 1), the 4th to 6th form the 2nd original cell Y (the second ordering of Y is 2), and the 7th to 9th form the 3rd original cell Z (the second ordering of Z is 3). The original cells X, Y, Z are domain transformed to obtain the domain transformation cells X', Y', Z'; an original cell and its domain transformation cell have the same second ordering, so the second ordering of X', Y', Z' is 1, 2, 3. According to the first ordering and the second ordering, the same-order domain transformation cell of each watermark point is determined, the first ordering of a watermark point being equal to the second ordering of its same-order domain transformation cell. For example, the first ordering of watermark point a is 1, so its same-order domain transformation cell is X'; the first ordering of watermark point b is 2, so its same-order domain transformation cell is Y'; the first ordering of watermark point c is 3, so its same-order domain transformation cell is Z'.
After the same-order domain transformation cells are determined, each watermark point is embedded into the embedding position of its same-order domain transformation cell to obtain the embedded cells. In the above example, watermark point a is embedded into the embedding position of X' to obtain the embedded cell X''; watermark point b is embedded into the embedding position of Y' to obtain the embedded cell Y''; and watermark point c is embedded into the embedding position of Z' to obtain the embedded cell Z''.

It is worth mentioning that all embedded cells use the same position information. For ease of understanding, the example of step A3111 is continued: the number of unit audio samples is 3 and the position information is 2. When the watermark carrier segment is segmented, it is equally divided into 3 original cells (or sub-segments) according to the number of unit audio samples 3, each containing 3 audio samples and each used to embed one watermark point: the 1st to 3rd audio samples form the 1st original cell (for the 1st watermark point), the 4th to 6th audio samples form the 2nd original cell (for the 2nd watermark point), and the 7th to 9th audio samples form the 3rd original cell (for the 3rd watermark point). The original cells are domain transformed (DCT) to obtain the domain transformation cells, and the 2nd audio sample in each domain transformation cell is then determined to be the embedded audio sample according to the position information 2; the 2nd audio sample in the 1st domain transformation cell corresponds to the 2nd audio sample of the whole watermark carrier segment, the 2nd audio sample in the 2nd domain transformation cell corresponds to the 5th, and the 2nd audio sample in the 3rd domain transformation cell corresponds to the 8th.
After the embedded audio samples are determined, the watermark points can be embedded. During embedding, the value of the watermark point and the sample value at the embedding position are combined by a calculation to obtain the embedded value; the calculation may be defined according to the actual situation, and other ways of realizing the "embedding" operation may also be adopted.

Further, in this embodiment, the embedding is described by taking multiplication as an example. Specifically, after the one-dimensional watermark sequence is obtained in step A212, a preset key sequence is first generated based on a preset algorithm, and the watermark sequence is then XOR-transformed with the preset key sequence to obtain a transformed watermark sequence, in which the value of each transformed watermark point is 0 or 1. This transformation preserves the difference between watermark points with different values while reducing the amplitude of the difference, so that the audio samples are not changed greatly during embedding. When embedding, the initial sample value at the embedding position in the domain transformation cell is first determined according to the position information; the initial sample value and the value of the corresponding transformed watermark point in the transformed watermark sequence are then substituted into a preset embedding formula to calculate the embedded value at the embedding position, which completes the embedding of the transformed watermark point. The cell obtained from the embedded value is called the embedded cell and is denoted Coeff. The preset embedding formula is as follows:

Coeff(i)(Index) = (DCT(i)(Index))*(1+2*Flag(i))

For example, when performed by matlab, it may be:

Coeff = cell(n,1); % define the embedded cell array Coeff

Coeff = DCT;

for i = 1:n

Coeff{i,1}(Index) = (DCT{i,1}(Index))*(1+2*Flag(i)); % embed by multiplication

end

In the above implementation, Coeff is the embedded cell, Index is the position information, and Flag(i) is the value of the i-th transformed watermark point.

By embedding the watermark in this multiplicative manner, the difference between watermark points with different values is preserved while the amplitude of the difference is reduced, so that the audio samples are not changed greatly during embedding.
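The multiplicative embedding formula can be sketched as follows, in Python for illustration only. The function name embed is hypothetical, the cells are assumed to be lists of transform coefficients, and the position information is 1-based as in the text.

```python
def embed(dct_cells, flags, index):
    """Apply Coeff(i)(Index) = DCT(i)(Index) * (1 + 2 * Flag(i)) to every cell."""
    embedded = [list(cell) for cell in dct_cells]  # copy, like Coeff = DCT
    for cell, flag in zip(embedded, flags):
        # A flag of 0 leaves the coefficient unchanged; a flag of 1 triples it.
        cell[index - 1] *= (1 + 2 * flag)
    return embedded


dct_cells = [[4.0, 2.0, 1.0], [3.0, -1.0, 0.5], [5.0, 0.5, 0.0]]
embedded = embed(dct_cells, flags=[1, 0, 1], index=2)
```

Only the coefficient at the embedding position of each cell is touched, and only when the corresponding transformed watermark point is 1.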

When the watermark sequence is XOR-transformed with the preset key sequence, the length of the watermark sequence can also be expanded by spread spectrum so as to enlarge the information content of the watermark information. Specifically, when the key is generated, the key length is determined according to the length of the watermark sequence, and the corresponding key sequence is generated according to the key length (the specific key values may be generated randomly); the key length is p times the length of the watermark sequence, where p is a positive integer, and the length of the watermark sequence is the number of watermark points in the watermark sequence (for example, a watermark sequence containing 3 watermark points has length 3). Then, based on the multiple relationship between the key length and the length of the watermark sequence, the watermark sequence is subjected to spread-spectrum XOR transformation with the key sequence to obtain the transformed watermark sequence: a third ordering of the watermark points in the watermark sequence and a fourth ordering of the key elements in the preset key sequence are determined, the key elements corresponding to each watermark point are determined from the multiple relationship and the two orderings, and each watermark point is XORed with its corresponding key elements, so that the number of transformed watermark points is equal to the number of key elements. For example, when implemented by matlab, it may be:

In the implementation of this embodiment, Length is the length of the watermark sequence, n is the length of the key sequence with n = 2 × Length (that is, p = 2), Flag(k) is the value of the k-th transformed watermark point, and k = 1, 2, …, n.

By the method, the watermark sequence is subjected to spread spectrum XOR transformation by using the key sequence with the multiple length, the length of the watermark sequence is expanded, and the transformed watermark sequence is embedded in the subsequent embedding operation, so that the information content of the embedded watermark information is enlarged, and more copyright information is added to the audio.
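The spread spectrum XOR transformation described above can be sketched as follows. This is a minimal illustration rather than the embodiment's own listing: the function names `make_key`, `spread_xor` and `despread_xor` are hypothetical, indices are 0-based, and the majority vote used to collapse the p recovered copies is an assumption (the text only states that the transformed sequence is XORed with the key again).

```python
import random


def make_key(watermark_length, p, seed=0):
    """Random 0/1 key whose length is p times the watermark length."""
    rng = random.Random(seed)
    return [rng.randint(0, 1) for _ in range(watermark_length * p)]


def spread_xor(watermark, key, p):
    """XOR the t-th watermark point with its p key elements.

    With 0-based indices, key element k belongs to watermark point k // p,
    which matches key elements (t-1)p+1 .. tp for the t-th point (1-based).
    """
    if len(key) != p * len(watermark):
        raise ValueError("key length must be p times the watermark length")
    return [watermark[k // p] ^ key[k] for k in range(len(key))]


def despread_xor(flags, key, p):
    """XOR with the key again, then collapse each group of p copies.

    Majority vote is an assumption of this sketch; ties resolve to 1.
    """
    copies = [f ^ k for f, k in zip(flags, key)]
    return [1 if 2 * sum(copies[t:t + p]) >= p else 0
            for t in range(0, len(copies), p)]
```

Because XOR with the same key element is its own inverse, an unmodified transformed sequence always recovers the original watermark points; the p copies only matter when some transformed points have been corrupted.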

Step A3114, inverse domain transformation is performed on each embedded cell to obtain each inverse transformation cell;

When the embedded cell Coeff is obtained, inverse domain transformation can be performed on it so as to convert the embedded audio samples from the transform domain back to the spatial domain; the inverse domain transformation and the domain transformation are inverse operations of each other. In the present embodiment, the domain transformation uses the two-dimensional discrete cosine transform (DCT), so the inverse domain transformation in this step uses the two-dimensional inverse discrete cosine transform (IDCT) to obtain each inverse transformation cell. The two-dimensional IDCT is the inverse operation of the two-dimensional DCT, and its principle is as follows:

where m, k = 0, 1, …, M-1 and n, l = 0, 1, …, N-1.
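For reference, one common orthonormal form of the two-dimensional inverse DCT, written with the index names used above, is sketched below. The symbols f (spatial-domain cell), F (transform-domain cell) and c (normalisation coefficient) are assumed names, and the normalisation shown is one standard convention rather than necessarily the one used in this embodiment.

```latex
f(m,n) = \sum_{k=0}^{M-1} \sum_{l=0}^{N-1}
         c_M(k)\, c_N(l)\, F(k,l)\,
         \cos\frac{(2m+1)k\pi}{2M}\,
         \cos\frac{(2n+1)l\pi}{2N},
\qquad
c_M(k) =
\begin{cases}
\sqrt{1/M}, & k = 0,\\
\sqrt{2/M}, & k = 1, \dots, M-1,
\end{cases}
```

with c_N(l) defined in the same way using N in place of M.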

for the two-dimensional inverse discrete cosine transform IDCT, it can be implemented by means of a related library function. For example, when implemented by matlab, it may be:

in the above implementation, IDCT is the inverse transform cell, and Coeff is the embedded cell.
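A dependency-free sketch of the transform pair is given below to illustrate that the IDCT undoes the DCT on a cell. It is not the embodiment's matlab listing: the function names `dct2` and `idct2` are chosen here for illustration, the orthonormal DCT-II convention is assumed, and a cell is assumed to be laid out as a list of rows.

```python
import math


def _basis(n):
    """Orthonormal DCT-II matrix C, with C[k][m] = c(k) * cos((2m+1)k*pi / (2n))."""
    rows = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        rows.append([scale * math.cos((2 * m + 1) * k * math.pi / (2 * n))
                     for m in range(n)])
    return rows


def _matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def _transpose(a):
    return [list(r) for r in zip(*a)]


def dct2(cell):
    """Forward 2D DCT of an M x N cell: C_M * cell * C_N^T."""
    cm, cn = _basis(len(cell)), _basis(len(cell[0]))
    return _matmul(_matmul(cm, cell), _transpose(cn))


def idct2(coeff):
    """Inverse 2D DCT: C_M^T * coeff * C_N (C is orthogonal, so C^-1 = C^T)."""
    cm, cn = _basis(len(coeff)), _basis(len(coeff[0]))
    return _matmul(_matmul(_transpose(cm), coeff), cn)
```

Applying `idct2(dct2(cell))` returns the original cell up to floating-point rounding, which is the property the extraction side relies on when it re-applies the domain transformation to the inverse transformation cells.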

Step A3115, obtaining an embedded segment from each of the inverse transformed cells

When the inverse transformation cells are obtained, they can be combined to obtain the embedded segment, that is, the watermark carrier segment with the watermark embedded.

In this embodiment, the two-dimensional discrete cosine transform is performed before the watermark is embedded, so that the volume amplitude is increased only at the embedding points; the embedded watermark is therefore not intrusive to the original audio and does not damage its data characteristics. Specifically, fig. 4 is a waveform diagram of the sound signal before and after watermark embedding in this embodiment; it can be seen that the watermarked waveform shows an increase in volume amplitude at some audio samples, but still retains most of the data characteristics of the original audio.

For ease of understanding, the details of the flow of the watermark embedding step of step S20 in this embodiment will be described again below:

obtaining a one-dimensional watermark sequence according to the encrypted watermark image;

carrying out spread spectrum transformation on the one-dimensional watermark sequence to obtain a transformed watermark sequence, specifically: determining the key length according to the length of the watermark sequence, and generating a corresponding key sequence according to the key length, wherein the key length is p times the length of the watermark sequence, and p is a positive integer; based on the multiple relation between the key length and the length of the watermark sequence, performing spread spectrum XOR transformation on the watermark sequence through the key sequence to obtain the transformed watermark sequence, wherein each watermark point in the watermark sequence is XORed p times, once with each of the p key elements at the corresponding positions in the key sequence, to obtain p transformed watermark points, and the value of each transformed watermark point is 1 or 0;

determining the number of watermark points in the transformed watermark sequence, and determining the number of unit audio sample points corresponding to each watermark point according to the number of the audio sample points and the number of the watermark points;

determining the required audio sample number corresponding to the transformed watermark sequence according to the unit audio sample number and the number of the watermark points;

segmenting the audio to be processed according to the required audio sample number to obtain corresponding watermark carrier segments and non-carrier segments;

performing two-dimensional Discrete Cosine Transform (DCT) on each original cell to obtain each domain transformation cell;

carrying out two-dimensional Inverse Discrete Cosine Transform (IDCT) on each embedded cell to obtain each inverse transformation cell;

combining the inverse transformation cells to obtain an embedded segment;

and combining the embedded segment and the non-carrier segment to obtain the watermark audio.
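The segmentation part of this flow can be sketched as follows. The helper name `segment_audio` is hypothetical, and two details are assumptions of this sketch rather than statements of the embodiment: the unit audio sample number is taken as the floor of (audio samples / watermark points), and the watermark carrier segment is taken from the start of the audio.

```python
def segment_audio(samples, num_points):
    """Split audio into per-watermark-point cells plus a non-carrier remainder.

    Returns (unit, cells, non_carrier) where
      unit        - audio samples per watermark point (assumed floor division),
      cells       - num_points original cells of `unit` samples each,
      non_carrier - samples left over after the carrier segment.
    """
    if num_points <= 0 or len(samples) < num_points:
        raise ValueError("audio is too short for this watermark")
    unit = len(samples) // num_points
    required = unit * num_points          # required audio sample number
    carrier = samples[:required]          # assumed: carrier is the leading part
    non_carrier = samples[required:]
    cells = [carrier[i * unit:(i + 1) * unit] for i in range(num_points)]
    return unit, cells, non_carrier
```

Concatenating the cells in order and appending the non-carrier segment reproduces the input, which mirrors the final step of combining the embedded segment with the non-carrier segment.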

Further, for the above watermark embedding process, the extraction of the encrypted watermark image may be realized in an inverse transformation manner, specifically:

obtaining the encrypted watermark image according to the watermark sequence;

Further, when the encrypted watermark image is obtained, it can be decrypted to obtain the original watermark image. In practical application, when an encrypted watermark image is extracted from an audio (or further decrypted to obtain a watermark image), it can be compared with the encrypted watermark image before embedding (or with the original watermark image); if the two images are consistent, the copyright ownership of the audio can be determined.

In addition, the embodiment of the invention also provides an audio processing device.

In this embodiment, the audio processing apparatus includes:

the watermark acquisition module is used for acquiring an original watermark image and encrypting the original watermark image to obtain an encrypted watermark image;

the position determining module is used for determining the position information embedded by the encrypted watermark image according to the information quantity of the encrypted watermark image and the audio to be processed;

and the watermark embedding module is used for embedding the encrypted watermark image into the audio to be processed according to the position information to obtain a watermark audio.

The virtual function modules of the audio processing apparatus are stored in the memory 1005 shown in fig. 1, and when executed by the processor 1001, the virtual function modules realize the audio processing function.

Further, the watermark obtaining module includes:

the first transformation unit is used for acquiring an original watermark image and performing signal decomposition on the original watermark image to acquire low-frequency and high-frequency information corresponding to the original watermark image;

the component encryption unit is used for encrypting the low-frequency and high-frequency information to obtain encrypted information;

and the second transformation unit is used for reconstructing the signal according to the encryption coefficient to obtain a corresponding encrypted watermark image.

Further, the low frequency and high frequency information includes an approximation component and a detail component,

correspondingly, the component encryption unit is specifically configured to determine the number of rows and columns of the approximate component, and determine an encryption component corresponding to the approximate component according to the number of rows and columns and a preset parameter; and obtaining encryption information according to the encryption component and the detail component.

Further, the position determination module includes:

the audio segmenting unit is used for acquiring the audio to be processed and segmenting the audio to be processed according to the information content of the encrypted watermark image and the audio to be processed to obtain a watermark carrier segment and a non-carrier segment;

a position determining unit for determining the position information of the watermark carrier segment embedded by the encrypted watermark image according to the information content of the encrypted watermark image and the audio to be processed;

Correspondingly, the watermark embedding module comprises:

and the watermark embedding unit is used for embedding the encrypted watermark image into the watermark carrier segment according to the position information to obtain an embedded segment, and combining the embedded segment and the non-carrier segment to obtain a watermark audio.

Further, the audio segmentation unit includes:

the audio acquisition subunit is used for acquiring audio to be processed and determining the number of audio samples of the audio to be processed;

the sequence acquisition subunit is used for acquiring a one-dimensional watermark sequence according to the encrypted watermark image and determining the number of watermark points in the watermark sequence;

the first determining subunit is used for determining the number of unit audio sample points corresponding to each watermark point according to the number of the audio sample points and the number of the watermark points;

the second determining subunit is used for determining the required audio sample number corresponding to the watermark sequence according to the unit audio sample number and the number of the watermark points;

the audio segmentation subunit is used for segmenting the audio to be processed according to the required audio sample number to obtain corresponding watermark carrier segments and non-carrier segments;

Further, the position determination unit includes:

the position determining subunit is used for determining the corresponding position information of the watermark point in the watermark carrier segment according to the unit audio sample number;

correspondingly, the watermark embedding unit is specifically configured to determine an embedding position in the watermark carrier segment according to the position information, and embed each watermark point into the watermark carrier segment according to the embedding position to obtain an embedded segment.

Further, the watermark embedding unit includes:

the segment dividing subunit is used for dividing the watermark carrier segment according to the unit audio sample number to obtain original cells corresponding to the watermark points one by one, wherein the number of the original cells is equal to the number of the watermark points, and the number of the audio sample points included in each original cell is equal to the unit audio sample number;

the first transformation subunit is used for carrying out domain transformation on each original cell to obtain the domain transformation cell corresponding to each original cell;

the watermark embedding subunit is used for determining an embedding position in each domain transformation cell according to the position information and respectively embedding each watermark point into the embedding position of the corresponding domain transformation cell to obtain each embedded cell;

the second transformation subunit is used for carrying out inverse domain transformation on each embedded cell to obtain each inverse transformation cell;

and the cell combination subunit is used for obtaining the embedded segment according to each inverse transformation cell.

Further, the watermark embedding subunit is specifically configured to determine a first ordering of each watermark point in the watermark sequence, and determine a second ordering of each domain transformation cell according to a position of each original cell in the watermark carrier segment; according to the first ordering and the second ordering, determining corresponding same-sequence domain transformation cells of each watermark point, wherein the first ordering of each watermark point is the same as the second ordering of the corresponding same-sequence domain transformation cells; and embedding the watermark points into the embedding positions of the corresponding same-sequence domain transformation cells respectively to obtain the embedded cells.

Further, the audio segmentation unit further includes:

the watermark transformation subunit is used for carrying out XOR transformation on the watermark sequence through a preset key sequence to obtain a transformed watermark sequence;

the watermark embedding subunit is specifically configured to determine a sampling point initial value of the embedding position in the domain transformation cell according to the position information; and to determine an embedded value of the embedding position according to the sampling point initial value and the value of the corresponding transformed watermark point in the transformed watermark sequence, so as to obtain an embedded cell according to the embedded value.

Further, the step of calculating the sampling point initial value and the value of the corresponding transformed watermark point in the transformed watermark sequence to obtain the embedded value of the embedded position includes:

substituting the initial value of the sampling point and the value of the corresponding transformed watermark point into a preset embedding formula, and calculating to obtain an embedded position embedding value, wherein the preset embedding formula is as follows:

Coeff(i)(Index)＝(DCT(i)(Index))*(1+2Flag(i))
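As a worked illustration of the preset embedding formula, the sketch below applies Coeff = DCT × (1 + 2·Flag) at one position of each domain transformation cell. The function names are hypothetical, and each cell is represented as a flat list of coefficients for brevity, which is an assumption of this sketch.

```python
def embed_value(dct_value, flag):
    """Embedded value for one position: DCT * (1 + 2 * Flag), Flag in {0, 1}."""
    if flag not in (0, 1):
        raise ValueError("flag must be 0 or 1")
    return dct_value * (1 + 2 * flag)


def embed_cells(dct_cells, flags, index):
    """Embed the i-th transformed watermark point into the i-th cell at `index`.

    The input cells are left untouched; new embedded cells are returned.
    """
    if len(dct_cells) != len(flags):
        raise ValueError("one transformed watermark point is needed per cell")
    embedded = []
    for cell, flag in zip(dct_cells, flags):
        out = list(cell)
        out[index] = embed_value(cell[index], flag)
        embedded.append(out)
    return embedded
```

Under this formula a flag of 0 leaves the coefficient unchanged and a flag of 1 triples it, so the change is proportional to the coefficient's own magnitude rather than a fixed offset.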

Further, the watermark transformation subunit determines a key length according to the length of the watermark sequence, and generates a corresponding preset key sequence according to the key length, where the key length is p times the length of the watermark sequence, and p is a positive integer; and based on the multiple relation between the key length and the length of the watermark sequence, performing spread spectrum XOR transformation on the watermark sequence through the preset key sequence to obtain a transformed watermark sequence, wherein one watermark point in the watermark sequence corresponds to p transformed watermark points in the transformed watermark sequence.

The watermark transformation subunit is specifically configured to determine a third ordering of each watermark point in the watermark sequence and a fourth ordering of each key element in the preset key sequence; according to the third ordering and the fourth ordering, to perform the exclusive-or operation p times, in sequence, between the t-th watermark point and the ((t-1)p+1)-th, ((t-1)p+2)-th, …, (tp)-th key elements, correspondingly obtaining the ((t-1)p+1)-th, ((t-1)p+2)-th, …, (tp)-th transformed watermark points, wherein t = 1, 2, …, L, and L is the number of watermark points in the watermark sequence; and to arrange the transformed watermark points according to their ordering to obtain the transformed watermark sequence.

Further, the audio processing apparatus further includes:

the watermark extraction module is used for segmenting the watermark audio according to the required audio sample number to obtain the embedded segment and the non-carrier segment; dividing the embedded segment according to the unit audio sample number to obtain the inverse transformation cells; performing the domain transformation on the inverse transformation cells to obtain the embedded cells; determining the embedded value of each embedded cell according to the position information; calculating the value of each transformed watermark point from the embedded value and the corresponding sampling point initial value; combining the values of the transformed watermark points to obtain the transformed watermark sequence, and performing XOR transformation on the transformed watermark sequence through the preset key sequence to obtain the watermark sequence before transformation; obtaining the encrypted watermark image according to the watermark sequence; and decrypting the encrypted watermark image to obtain the original watermark image.
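The step of calculating the transformed watermark point from the embedded value and the sampling point initial value can be sketched by inverting the embedding formula Coeff = DCT × (1 + 2·Flag). The function name `extract_flag` is hypothetical, and the decision threshold (rounding the recovered value to the nearer of 0 and 1) is an assumption of this sketch; the text does not specify how a non-exact ratio is handled.

```python
def extract_flag(embedded_value, initial_value):
    """Recover Flag from Coeff = DCT * (1 + 2 * Flag).

    Flag = (Coeff / DCT - 1) / 2, rounded to 0 or 1 (assumed threshold 0.5).
    """
    if initial_value == 0:
        raise ValueError("initial value is zero; the ratio is undefined")
    ratio = embedded_value / initial_value
    return 1 if (ratio - 1) / 2 >= 0.5 else 0
```

Because the calculation uses the sampling point initial value, the extractor needs access to the coefficient as it was before embedding, in addition to the key sequence.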

The function implementation of each module of the audio processing apparatus corresponds to each step in the embodiment of the audio processing method, and the function and implementation process thereof are not described in detail herein.

In addition, the embodiment of the invention also provides a readable storage medium.

The readable storage medium of the invention has stored thereon a computer program which, when being executed by a processor, carries out the steps of the audio processing method as described above.

The method implemented when the computer program is executed can refer to the embodiments of the audio processing method of the present invention, and is not described herein again.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.

The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.

Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) as described above and includes instructions for enabling a terminal device (e.g., a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.

The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims

1. An audio processing method, characterized in that the audio processing method comprises:

2. The audio processing method as claimed in claim 1, wherein the step of obtaining the original watermark image and encrypting the original watermark image to obtain the encrypted watermark image comprises:

3. The audio processing method of claim 2, wherein the low frequency and high frequency information includes an approximation component and a detail component,

4. The audio processing method according to claim 1, wherein the step of obtaining the audio to be processed and determining the position information in which the encrypted watermark image is embedded according to the information amount of the encrypted watermark image and the audio to be processed comprises:

5. The audio processing method according to claim 4, wherein the step of obtaining the audio to be processed and segmenting the audio to be processed according to the information amount of the encrypted watermark image and the audio to be processed to obtain a watermark carrier segment and a non-carrier segment comprises:

6. The audio processing method according to claim 5, wherein the step of determining the position information of the watermark carrier segment in which the encrypted watermark image is embedded according to the information amount of the encrypted watermark image and the audio to be processed comprises:

7. The audio processing method according to claim 6, wherein the step of determining an embedding position in the watermark carrier segment according to the position information and embedding each watermark point into the watermark carrier segment according to the embedding position to obtain an embedded segment comprises:

8. The audio processing method according to claim 7, wherein the step of embedding each watermark point distribution into the embedding position of the corresponding domain transformation cell to obtain each embedded cell comprises:

9. The audio processing method of claim 7, wherein the step of deriving a one-dimensional watermark sequence from the encrypted watermark image further comprises:

10. The audio processing method of claim 9, wherein the step of determining the embedding value of the embedding location based on the sample point initial value and the value of the corresponding transformed watermark point in the transformed watermark sequence comprises:

Coeff(i)(Index)＝(DCT(i)(Index))*(1+2Flag(i))

11. The audio processing method as claimed in claim 9, wherein the step of performing xor transformation on the watermark sequence by using the predetermined key sequence to obtain the transformed watermark sequence further comprises:

12. The audio processing method as claimed in claim 11, wherein the step of performing a spread-spectrum xor transformation on the watermark sequence by using the predetermined key sequence to obtain a transformed watermark sequence comprises:

13. The audio processing method according to claim 9, wherein after the step of embedding the encrypted watermark into the audio segment corresponding to the audio to be processed to obtain the watermarked audio, the method further comprises:

obtaining the encrypted watermark image according to the watermark sequence;

14. An audio processing device, characterized in that the audio processing device comprises a processor, a memory, and a computer program stored on the memory and executed by the processor, wherein the computer program, when executed by the processor, implements the steps of the audio processing method according to any one of claims 1 to 13.

15. A storage medium, characterized in that the storage medium has stored thereon a computer program, wherein the computer program, when executed by a processor, carries out the steps of the audio processing method according to any of claims 1 to 13.