WO2009107054A1 - Method of embedding data in stereo image - Google Patents
Method of embedding data in stereo image
- Publication number
- WO2009107054A1 (application PCT/IB2009/050726)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- spatial image
- image parameter
- icc
- altered
- audio signal
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/018—Audio watermarking, i.e. embedding inaudible data in the audio signal
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
Definitions
- the present invention relates to a method for embedding a watermark into an audio signal and a method for determining a watermark in an audio signal.
- the method moreover relates to an embedder for embedding a watermark into an audio signal and a detector for determining a watermark in an audio signal.
- the method further relates to a computer program product arranged for enabling a computer system to carry out any of the methods of the invention.
- DRM Digital Rights Management
- a major part of the DRM systems employ watermark embedding as this prevents uncoupling of rights (as eventually signaled by the watermark) from the content. This way the audio stream itself can be used as rights authentication.
- the content provider can employ forensic tracking towards illegitimate content in an attempt to find the source of distribution.
- SBR Spectral Band Replication
- SBR reconstructs the higher frequencies in the decoder based on an analysis of the lower frequencies transmitted in the underlying coder. To ensure an accurate reconstruction, some guidance information is transmitted in the encoded bit-stream at a very low data rate.
- SBR in combination with AAC is successfully exploited in the market place and also known as HE-AAC or aacPlus.
- the SSC standard also specifies a parametric stereo (PS) tool that extends the parametric coder to stereo.
- PS parametric stereo
- a Parametric Stereo encoder extracts a parametric representation of the stereo image of an audio signal, whereas only a monaural representation of the original signal is encoded in a conventional fashion. The stereo image information is represented as a small amount of high quality parametric stereo information and transmitted along with the monaural signal in the bit stream.
- the decoder is capable of regenerating the stereo image.
- Stereo parameters comprise inter-channel intensity differences (IID), inter-channel correlation or coherence (ICC) and optionally inter-channel time or phase differences (ITD/IPD).
- the PS tool can be combined with codecs such as HE-AAC. This has resulted in HE-AAC v2 or enhanced aacPlus, a coder that provides a significant reduction in bit-rates that has enabled cost effective mobile music downloads.
- an MPEG Surround decoder is able to upmix a (mono or stereo) downmix to multi-channel by means of spatial image parameters.
- an improved method of embedding a watermark into an audio signal would be advantageous, and in particular a more efficient and/or robust method of embedding a watermark into an audio signal and/or a method of embedding a watermark into an audio signal having increased capacity would be advantageous. Moreover, an improved method of embedding a watermark into an audio signal independently of coding scheme of the audio signal would be advantageous.
- said at least one original spatial image parameter comprising the Inter-channel Intensity Difference and/or the Inter-channel Correlation
- the method of the invention in contrast relies on statistical aspects of the stereo image in the form of parameters derived from the at least one original spatial image parameter.
- the method relates to embedding a watermark into at least one spatial image parameter, viz. at least one parameter related to the stereo image of the audio signal and/or into parameters derived from such at least one spatial image parameter.
- parameters could e.g. be the Eigen values of the normalized covariance matrix of the audio signal or their ratio, or it could be one or both of the original spatial image parameters themselves.
- Embedding a watermark into the at least one original spatial image parameter or into parameters derived from the spatial image parameters of a stereo or multichannel signal may provide a small or unperceivable alteration of the stereo or multichannel image, viz the perceptual attributes that construct the spatial perception of the stereo or multichannel signal.
- the altered signal pair is to be only slightly different from the original signal pair, with a slightly altered stereo or spatial image.
- perceptually the altered and the original signal pairs should be identical due to the perceptual limitations of the human auditory system.
- the invention is particularly, but not exclusively, advantageous in that it renders it possible to embed a watermark in stereo or multichannel audio signals, irrespective of which coding scheme has been used for coding the audio signal.
- the method of the invention may also be used for embedding a watermark into a parametrically coded stereo or multichannel signal.
- the invention is moreover advantageous in that it provides a robust embedding of a watermark into a stereo or multichannel audio signal.
- the method may comprise an additional step between step (a) and step (b), where the additional step comprises transforming the time domain signal to frequency domain, and wherein the method may comprise a further step of transforming the watermarked signal back to time domain.
- the frequency domain signal may be analyzed in bands, preferably resembling the human auditory system, and the embedding may be performed on each band.
- the construction in step (d) of the method comprises multiplying the signal pair by a construction matrix to construct the altered signal pair conforming to the altered at least one spatial image parameter.
- the construction in step (d) comprises multiplying the signal pair by a construction matrix, where the construction matrix is obtainable as a product of a decomposition matrix and a reconstruction matrix, where the decomposition matrix is arranged to, when multiplied onto the signal pair, decompose the signal pair into two substantially orthogonal signals and where the reconstruction matrix is arranged to, when multiplied onto the substantially orthogonal signals, construct the altered signal pair conforming to the altered at least one spatial image parameter.
- step (c) comprises the steps of:
- the quantizing in step (c1) may correspond to a quantization step size, and the data embedded into the at least one quantized spatial image parameter may preferably be smaller than the quantization step size in step (c1).
- the data to be embedded are smaller than the quantization step size. If, for example, the quantization step size is based on the Just Noticeable Differences (JND), this would lead to imperceptible differences between the audio signal before and after watermarking.
- the dequantizing step (c3) may therefore correspond to a quantization step size that is smaller than the quantization step size in (c1).
- step (c) of the method comprises the steps of:
- step (c4) where $P_x$ and $P_y$ represent the power of signal x and y respectively, and $P_{xy}$ represents the non-normalized cross-correlation, example mapping functions as in step (c4) are given by:
- step (c6) The inverse mapping functions of step (c6) are in this case given by:
- the human auditory system is not equally sensitive to changes of the stereo parameters when the IID is represented on a dB grid or the ICC on a linear scale. For example, at intensity differences around 0 dB (left and right are equally strong) one is more sensitive to changes than e.g. at +20 dB (left much stronger than right).
- the parameters are manipulated in this domain since it is possible to predict the perceptual effect.
- the inverse mapping function is applied to get back to the original domain. This way mapping the original spatial image parameters, so that the perceptual effect of the mapping is substantially linear, provides a watermarking which is imperceptible or almost imperceptible.
- the step (c5) of embedding data comprises: associating the data to be embedded with a predetermined pattern; and embedding the predetermined pattern.
- each watermark symbol is associated with a pattern, and this pattern is then embedded.
- the watermark symbol 0 may be e.g. associated to the pattern [a, -a, a, -a, ...] whereas the symbol 1 may be associated with [-a, a, -a, a, ...].
- the data to be embedded in step (c5) are moreover encoded by a pseudo random bit sequence, where the data to be embedded comprises the pseudo random bit sequence shifted cyclically by an offset represented by the data payload.
- the data to be encoded could consist of a cyclically rotated (time varying) pattern:
- Symbol 0 could correspond to [0.3, 0.2, -0.2, 0.1, -0.5, 0.2, -0.1, 0.1]
- Symbol 1 could correspond to [0.1, 0.3, 0.2, -0.2, 0.1, -0.5, 0.2, -0.1]
- Symbol 2 could correspond to [-0.1, 0.1, 0.3, 0.2, -0.2, 0.1, -0.5, 0.2], etc.
- Such patterns may be applied in the time or in the frequency direction (over different bands) or even in time and frequency direction.
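The cyclically rotated patterns listed above can be sketched in a few lines of Python. This is only an illustration: the base pattern is the "Symbol 0" example from the text, while the helper name `pattern_for_symbol` is hypothetical and not taken from the source.

```python
# Base pattern: the "Symbol 0" example given in the text.
BASE_PATTERN = [0.3, 0.2, -0.2, 0.1, -0.5, 0.2, -0.1, 0.1]

def pattern_for_symbol(symbol, base=BASE_PATTERN):
    """Return the base pattern rotated right by `symbol` positions.

    Hypothetical helper: the payload value is used directly as the
    cyclic offset, which is one reading of the description above.
    """
    k = symbol % len(base)
    return base[-k:] + base[:-k] if k else list(base)

for s in range(3):
    print(s, pattern_for_symbol(s))
```

With this rotation direction, symbols 1 and 2 reproduce the example patterns given in the text for those symbols.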
- the method further comprises the following steps:
- (f0) transforming the time domain signal (A) to frequency domain, wherein step (f0) is carried out between step (a) and step (b), and (f1) decomposing the signal pair into two substantially orthogonal signals and transforming the main signal component of the two substantially orthogonal signal components derived in step (d) from the frequency domain to the time domain to obtain a time domain main signal component;
- step (c) of the method comprises:
- the method may further comprise the steps of:
- (c10) performing an inverse mapping operation of the at least one altered mapped spatial image parameter to obtain the at least one altered spatial image parameter, wherein the inverse mapping operation is the inverse operation of the mapping operation.
- step (d) of constructing the watermarked audio signal comprises combining the at least one altered spatial image parameter with the mono bit stream derived from the audio signal by:
- the at least one original spatial image parameter may further comprise the Inter-channel Phase Difference (IPD) and/or the Inter-channel Time Difference (ITD).
- IPD Inter-channel Phase Difference
- ITD Inter-channel Time Difference
- the invention moreover relates to a method of detecting a watermark in an audio signal, where the signal comprises at least one spatial image parameter, where said at least one spatial image parameter comprises the Inter-channel Intensity Difference and/or the Inter-Channel Correlation, the method comprising the steps of: (h) retrieving the audio signal;
- the correlation or probability of fit of the altered spatial image parameters with the possible patterns are determined in order to identify the watermark as the best fitting pattern.
- the method of detecting a watermark encompasses the object of determining whether a watermark is present in the audio signal or not. For example, if the result of step (j) is that the at least one spatial image parameter is equal to a reference spatial image parameter, it may be concluded that no watermark was present in the audio signal. If the result of step (j) was that the at least one spatial image parameter is different from a reference value or that the correlation between it and a reference pattern is different from 1, then it may be concluded that a watermark was present in the audio signal.
- the wording "determining a watermark" in step (j) is meant to cover the determination of whether a watermark is present or not as well as the determination of the content or effect of the watermark.
- the invention relates to an embedder for embedding a watermark into an audio signal, the embedder comprising: a receiver for receiving the audio signal, said audio signal comprising a signal pair; a processor arranged for:
- the invention relates to a detector for detecting a watermark from an audio signal, where the audio signal comprises at least one spatial image parameter, where said at least one spatial image parameter comprises the Inter-channel Intensity Difference and/or the Inter-Channel Correlation, the detector comprising: means for retrieving the audio signal; a processor arranged for:
- the processor of the detector for detecting a watermark is arranged for determining a watermark.
- determine a watermark is meant to cover the determination of whether a watermark is present or not as well as the determination of the content or effect of the watermark.
- the invention relates to a computer program product being adapted to enable a computer system comprising at least one computer having data storage means associated therewith to carry out one or more of the methods according to the invention.
- the embodiments of the method may be implemented as a computer program product being adapted to enable a computer system comprising at least one computer having data storage means in connection therewith to control an embedder or a detector according to the different aspects of the invention.
- a computer program product may be provided on any kind of computer readable medium, or through a network.
- Fig. 1 is a flow chart representing a method for embedding a watermark into an audio signal according to an embodiment of the invention
- Figs. 2 to 4 are functional flow diagrams representing the data flows in watermark embedding methods according to embodiments of the invention
- Figs. 5 and 6 are functional flow diagrams representing data flows in methods of extracting a watermark from a watermarked audio signal according to embodiments of the invention
- Figs. 7 and 8 are flow charts representing the data flows in a method according to embodiments of the invention for embedding a watermark into an audio signal, wherein the method comprises server precoding;
- Fig. 9 is a diagram of an embodiment of a system for embedding and/or extracting a watermark into and/or from an audio signal.
- FIG. 1 is a flow chart representing a method 100 for embedding a watermark into an audio signal according to an embodiment of the invention.
- the method 100 starts in step S and continues to step a, wherein an audio signal A comprising a signal pair l, r is received.
- the audio signal may be a stereo signal or a multichannel signal.
- in step b, at least one original spatial image parameter of the audio signal A is derived.
- the at least one original spatial image parameter comprises the Inter-channel Intensity Difference IID and/or the Inter-channel Correlation ICC.
- the at least one original spatial image parameter may additionally comprise the Inter-channel Phase Difference IPD and/or the Inter-channel Time Difference ITD.
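As a rough illustration of step b, the sketch below estimates the IID and ICC of one frame (or one band) of a signal pair. The formulas are assumptions, since the text names the parameters without defining them here: IID is taken as the left/right power ratio in dB and ICC as the normalized cross-correlation, as is common in parametric stereo coding. The function name is hypothetical, and both channels are assumed to have non-zero power.

```python
import math

def spatial_parameters(left, right):
    """Estimate (IID in dB, ICC) for one frame or band of a signal pair.

    Assumed definitions (not spelled out in the text):
      IID = 10 * log10(P_l / P_r)
      ICC = P_lr / sqrt(P_l * P_r)
    """
    p_l = sum(x * x for x in left)                   # power of the left channel
    p_r = sum(y * y for y in right)                  # power of the right channel
    p_lr = sum(x * y for x, y in zip(left, right))   # non-normalized cross-correlation
    iid_db = 10.0 * math.log10(p_l / p_r)
    icc = p_lr / math.sqrt(p_l * p_r)
    return iid_db, icc

# A right channel that is the left channel at half amplitude:
# fully correlated, with the left channel about 6 dB stronger.
left = [1.0, 2.0, 3.0, 4.0]
right = [0.5, 1.0, 1.5, 2.0]
print(spatial_parameters(left, right))
```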
- in step c, a watermark is embedded into parameters derived from the at least one original spatial image parameter IID, ICC in order to obtain at least one altered spatial image parameter IID', ICC'.
- the parameters derived from the at least one original spatial image parameter IID, ICC may be one or both of the original spatial image parameters IID, ICC themselves.
- in step d, a watermarked audio signal A' is constructed from the signal pair l, r, the watermarked audio signal A' comprising an altered signal pair l', r' and the at least one altered spatial image parameter IID', ICC'.
- Steps c and d may be carried out in various ways; a few examples are described below.
- Figure 2 is a generalized functional flow diagram of the data flow in an embodiment of the method for embedding a watermark into a stereo audio signal.
- An audio signal A comprising a left-right signal pair l, r is input into block 10, wherein the left-right signal pair l, r of the audio signal A is decomposed into two orthogonal or nearly orthogonal signals $y_1$, $y_2$, e.g. by use of a matrix operation: $[y_1, y_2]^T = O\,[l, r]^T$.
- O is a 2 x 2 matrix with components that are a function of one or both of the two original spatial image parameters: the inter-channel intensity difference IID and the inter-channel correlation ICC. If the phase difference between the left and right channel is taken into account, the correlation can be decomposed into an inter-channel coherence and a third stereo image parameter, an inter-channel phase difference IPD or inter-channel time difference ITD.
- the left-right decomposition can be conducted in a number of ways. One of these is by means of a Singular Value Decomposition (SVD) of the covariance matrix.
- the covariance matrix can be defined as follows:
- the covariance matrix C can be expressed in a form $C_n$ that is independent of the overall signal level, represented by the two powers $\sigma_l^2$, $\sigma_r^2$ for the left and right signals, respectively, using their arithmetic mean:
- $C_n$ is thus fully defined by the IID ($\sigma_l^2 / \sigma_r^2$) and ICC ($\rho_{lr}$) parameters.
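The normalization described above can be written out explicitly. The following is a reconstruction from the verbal description and should be read as a sketch rather than as the patent's own equations; it assumes zero-mean, real-valued signals and writes $c = \sigma_l^2/\sigma_r^2$ for the IID expressed as a linear power ratio:

```latex
C = \begin{pmatrix}
      \sigma_l^2 & \rho_{lr}\,\sigma_l\sigma_r \\
      \rho_{lr}\,\sigma_l\sigma_r & \sigma_r^2
    \end{pmatrix},
\qquad
C_n = \frac{2}{\sigma_l^2+\sigma_r^2}\,C
    = \frac{2}{c+1}
      \begin{pmatrix}
        c & \rho_{lr}\sqrt{c} \\
        \rho_{lr}\sqrt{c} & 1
      \end{pmatrix}
```

Under these assumptions the trace of $C_n$ is always 2, so only $c$ and $\rho_{lr}$ remain as free quantities, which is consistent with the statement that the normalized matrix is fully defined by the IID and ICC.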
- the decomposition operation O is derived from the SVD of $C_n$.
- the IID and the ICC of the left-right signal pair l, r of the audio signal are measured and the $C_n$ matrix is calculated.
- SVD is applied in order to obtain the rotation matrix O.
- the rotation matrix O is applied on the left-right signal pair l, r to obtain the orthogonal or nearly orthogonal pair $y_1$, $y_2$.
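The decomposition steps above can be sketched without a linear algebra library, because for a symmetric 2 x 2 covariance matrix the rotation that diagonalizes it has a closed form. This is a stand-in for the SVD, not the patent's implementation: the function names are hypothetical, and the un-normalized powers are used directly, since scaling the covariance matrix does not change its eigenvectors.

```python
import math

def decomposition_matrix(left, right):
    """Return a 2x2 rotation O that decorrelates the signal pair.

    Closed-form principal-axis rotation of the 2x2 covariance matrix
    (used here in place of a general SVD routine).
    """
    p_l = sum(x * x for x in left)
    p_r = sum(y * y for y in right)
    p_lr = sum(x * y for x, y in zip(left, right))
    theta = 0.5 * math.atan2(2.0 * p_lr, p_l - p_r)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]

def apply_matrix(m, first, second):
    """Multiply the 2x2 matrix m onto every sample pair."""
    out1 = [m[0][0] * a + m[0][1] * b for a, b in zip(first, second)]
    out2 = [m[1][0] * a + m[1][1] * b for a, b in zip(first, second)]
    return out1, out2

left = [1.0, 0.5, -0.3, 0.8, -1.2, 0.1]
right = [0.9, 0.7, -0.1, 0.5, -1.0, -0.2]

O = decomposition_matrix(left, right)
y1, y2 = apply_matrix(O, left, right)

# Cross-correlation of the decomposed pair: numerically close to zero.
cross = sum(a * b for a, b in zip(y1, y2))
print("cross-correlation of y1, y2:", cross)

# The inverse of a rotation is its transpose, so the pair can be rebuilt.
O_inv = [[O[0][0], O[1][0]], [O[0][1], O[1][1]]]
left_back, right_back = apply_matrix(O_inv, y1, y2)
```

With this choice of angle, `y1` carries the larger share of the power and so plays the role of the main signal component.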
- the watermark signal is embedded in this space, as explained below, so as to be imperceptible or almost imperceptible.
- the IID and the ICC are quantized, e.g. by means of a quantization operation Q, into the quantized spatial image parameters $IID_{idx}$ and $ICC_{idx}$.
- the embedding signal ES or the watermark is subsequently added to the quantized spatial image parameters $IID_{idx}$ and $ICC_{idx}$ in block 12.
- the watermark or embedding signal ES is preferably in the form of small deviations of the quantized spatial image parameters $IID_{idx}$ and $ICC_{idx}$.
- the result of the embedding is the altered quantized spatial image parameters $IID_{idx}+IID_{delta}$ and $ICC_{idx}+ICC_{delta}$.
- the original sequence of IID quantizer indices was [3, 4, 1, -8, 4, 6, 5, ...]
- this sequence could look like [3.125, 4.0, 0.625, -8.125, 4.375, 6.0, 5.375, ...].
- a three bit value (eight steps) has been encoded.
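The numeric example above can be reproduced with a short sketch. The payload values below are not given in the text; they are inferred from the difference between the two index sequences and expressed in units of one eighth of a quantization step. The function name `embed_in_indices` is hypothetical.

```python
def embed_in_indices(indices, payload, steps=8):
    """Add a fractional offset (a multiple of 1/steps) to each integer index.

    Every offset stays below one quantization step, so the integer part
    of the index is still recoverable by rounding.
    """
    return [i + p / steps for i, p in zip(indices, payload)]

indices = [3, 4, 1, -8, 4, 6, 5]       # IID quantizer indices from the text
payload = [1, 0, -3, -1, 3, 0, 3]      # inferred offsets, in eighths of a step

print(embed_in_indices(indices, payload))
# [3.125, 4.0, 0.625, -8.125, 4.375, 6.0, 5.375]
```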
- the altered quantized spatial image parameters $IID_{idx}+IID_{delta}$ and $ICC_{idx}+ICC_{delta}$ are converted back to the original domain by means of the inverse mapping $Q^{-1}$.
- the result of the inverse mapping $Q^{-1}$ is the altered stereo parameters IID', ICC'.
- from the altered stereo parameters IID', ICC', a covariance matrix $C_n'$ may be calculated and an SVD can be applied to obtain another rotation matrix O'.
- the inverse $O'^{-1}$ of this matrix O' is calculated. If the IID and ICC parameters were not changed, the matrix operation $O'^{-1}$ would be the inverse of the rotation matrix O.
- the original stereo image of the input signal pair l, r has been slightly altered to the signal pair l', r' without perceptual consequences. Since this operation does not reinstate the correct IID, the IID can be altered after the matrix operation $O'^{-1}$ by applying an additional diagonal gain matrix. This will not influence the correlation parameter.
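The claim that a diagonal gain matrix changes the IID but leaves the correlation untouched can be checked numerically. The sketch below is illustrative only: it assumes IID is the power ratio in dB and ICC the normalized cross-correlation, and it splits the correction symmetrically as diag(g, 1/g), which is one possible choice and not stated in the text. All names are hypothetical.

```python
import math

def iid_icc(left, right):
    """IID in dB and normalized cross-correlation of a signal pair (assumed definitions)."""
    p_l = sum(x * x for x in left)
    p_r = sum(y * y for y in right)
    p_lr = sum(x * y for x, y in zip(left, right))
    return 10.0 * math.log10(p_l / p_r), p_lr / math.sqrt(p_l * p_r)

def apply_gain_matrix(left, right, target_iid_db):
    """Apply diag(g, 1/g) so that the IID of the result equals target_iid_db."""
    current_iid_db, _ = iid_icc(left, right)
    # The power ratio scales with g**4, i.e. the IID moves by 40*log10(g) dB.
    g = 10.0 ** ((target_iid_db - current_iid_db) / 40.0)
    return [g * x for x in left], [y / g for y in right]

left = [1.0, 0.5, -0.3, 0.8, -1.2, 0.1]
right = [0.9, 0.7, -0.1, 0.5, -1.0, -0.2]

old_iid, old_icc = iid_icc(left, right)
new_left, new_right = apply_gain_matrix(left, right, target_iid_db=3.0)
new_iid, new_icc = iid_icc(new_left, new_right)
print(old_iid, old_icc)
print(new_iid, new_icc)
```

Because both gains are positive, the cross-correlation and the geometric mean of the powers scale by the same factor, so the ICC is unchanged up to rounding.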
- the orthogonal or nearly orthogonal pair $y_1$, $y_2$ is applied to the matrix $O'^{-1}$ in order to render an audio signal A' having a slightly altered signal pair l', r'.
- This slightly altered signal pair l', r' thus constitutes or is comprised in an audio signal A' having a slightly altered spatial image compared to the original audio signal A.
- the altered signal pair l', r' is obtainable as a product of the original signal pair l, r and a construction matrix N, where the construction matrix N is obtainable as a product of a decomposition matrix N1 and a reconstruction matrix N2.
- the original spatial image parameters comprise IID and ICC. However, it should be noted that they may further comprise the Inter-channel Phase Difference (IPD) and/or the Inter-channel Time Difference (ITD).
- Figure 3 is another generalized functional flow diagram of the data flow in an embodiment of the method for embedding a watermark into a stereo audio signal, in the case where the audio signal A is in time domain.
- the elements 10, 11, 12, 13 and 14 of figure 2 are also found in figure 3 and will not be described in further detail again.
- the method disclosed in figure 3 comprises an additional step compared to the method disclosed in figure 2 prior to the decomposition in block 10.
- an additional block 9 transforms the time domain audio signal to frequency domain.
- the blocks 10-14 correspond to the blocks 10-14 in figure 2.
- the result of the block 14 is a watermarked audio signal in the frequency domain.
- a further block 15 transforming the watermarked signal back to time domain is provided.
- the output from the block 15 is a watermarked audio signal A' having the altered signal pair l', r'.
- An advantage of the method described in the functional flow diagram of figure 3 is that a frequency domain audio signal A may be analyzed in bands, preferably resembling the human auditory system, and the embedding EMB may be performed on each band. This will provide a higher watermarking capacity compared to the watermarking system operated in the time domain.
- Figure 4 is a generalized functional flow diagram of the data flow in an alternative embodiment of the method for embedding a watermark into a stereo audio signal.
- the elements 9, 10, 12 and 14, 15 of figure 3 are also found in figure 4 and will not be described in further detail again.
- the elements 11' and 13' in figure 4 correspond to the elements 11 and 13 in figure 3, but in figure 4 the block 11' is not a quantization but a mapping function M1 and block 13' is not a dequantization but an inverse mapping operation M2, where the inverse mapping operation M2 is the inverse operation of the mapping function M1.
- the mapping function M1 may be a quantization function without being limited to this type of function.
- the mapping in block 11' of the at least one original spatial image parameter IID and/or ICC by a mapping function M1 to obtain at least one mapped spatial image parameter $ICC_{idx,flt}$ is performed so that the perceptual effect of the mapping function is substantially linear.
- the embedding data or embedding signal ES to be embedded into the at least one mapped spatial image parameter $ICC_{idx,flt}$ to obtain at least one altered mapped spatial image parameter $ICC_{idx,flt}+ICC_{delta}$ may be associated with a predetermined pattern.
- the watermark symbol 0 may be associated to the pattern [a, -a, a, -a, ...] whereas the watermark symbol 1 may be associated with [-a, a, -a, a, ...].
- embedding a sequence 0010 would result in twice the pattern [a, -a, a, -a, ...], followed by once [-a, a, -a, a, ...] and finally followed by [a, -a, a, -a, ...] again.
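The mapping from watermark symbols to alternating patterns described above can be sketched as follows. The amplitude `a = 0.25` and the pattern length of 4 are arbitrary illustrative choices, and the function names are hypothetical.

```python
def symbol_pattern(bit, a, length):
    """Alternating pattern: [a, -a, a, ...] for bit 0, [-a, a, -a, ...] for bit 1."""
    sign = 1.0 if bit == 0 else -1.0
    return [sign * a * (1 if k % 2 == 0 else -1) for k in range(length)]

def embed_bits(bits, a=0.25, length=4):
    """Concatenate the patterns of all bits into one embedding signal."""
    signal = []
    for bit in bits:
        signal.extend(symbol_pattern(bit, a, length))
    return signal

# The sequence 0010 from the text: twice the "0" pattern, once the "1"
# pattern, then the "0" pattern again.
print(embed_bits([0, 0, 1, 0]))
```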
- the pattern could consist of a cyclically rotated (time varying) pattern in case of a pseudo random sequence. It is also to be noted that such patterns may be applied in the time or in the frequency direction (over different bands) or even in time and frequency direction.
- the value "a" depends on the domain in which the embedding takes place. In the index domain, going from one value (e.g. +3) to the next (e.g. +4) typically represents a Just Noticeable Difference (JND), and in this case "a" typically represents a value ≪ 1. In general, the variable "a" is smaller than the JND.
- an inverse mapping operation is performed on the at least one altered mapped spatial image parameter $ICC_{idx,flt}+ICC_{delta}$ to obtain the at least one altered spatial image parameter IID', ICC', where the inverse mapping operation M2 is the inverse operation of the mapping function M1.
- the altered audio signal A' is encoded using a legacy encoder in block 16 in order to provide an output bit stream BS.
- the data embedding for each symbol may be repeated for a number of N frames or in case of a pseudo random sequence, the pattern may be spread over a number of frames.
- the IID, ICC and ITD parameters are used to embed the watermark.
- a watermark could also be embedded in any parameter(s) that are determined from these parameters, or parameters that are determined from the covariance matrix C.
- watermark embedding can also be applied on the Eigen values (or their ratio) of the normalized covariance matrix $C_n$.
- virtually any parametric representation of the covariance matrix could be a basis for watermark embedding.
- the use of IID, ICC and ITD has the advantage that the perceptual consequences are predictable and it allows for a robust watermark as well, as will be outlined in the next embodiment.
- Spectral smearing: in general, it is impossible to measure the stereo image parameters on perfectly rectangular bands, as this would require infinitely long filter responses.
- Temporal smearing: due to a necessary overlap of analysis/synthesis windows, the stereo image cues get filtered (smeared) over time.
- the embedder takes into account the effects caused by temporal and spectral smearing and compensates for this. This typically results in an embedding with slightly higher strength. Hence, special care needs to be taken to prevent perceptual detectability.
- FIG. 5 is a functional flow diagram representing the data flow in an embodiment of a method of extracting a watermark from a watermarked audio signal according to the invention.
- a stereo signal A originating from a watermark embedder is analyzed in block
- the spatial image parameters IID, ICC are quantized to indices $IID_{idx}$, $ICC_{idx}$ in block 22 and dequantized in block 23.
- the resulting quantized values are subtracted from the originally measured IID and ICC values in block 24.
- the resulting IID and ICC values mostly only contain the watermark data $IID_{delta}$, $ICC_{delta}$. Due to both temporal and spectral smearing, and any other processing applied to the signal, this watermark data can be degraded.
- the watermark data is fed to the embedding extractor block 25.
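The quantize, dequantize and subtract chain of blocks 22 to 24 can be sketched as below. The sketch works on values already expressed on the index scale and assumes a uniform quantizer with unit step size, which is a simplification; the function name and the `step` parameter are hypothetical.

```python
def extract_residual(measured, step=1.0):
    """Quantize each value to the nearest multiple of `step`, dequantize,
    and return the difference between the measured and dequantized values."""
    residual = []
    for value in measured:
        index = round(value / step)      # quantization (block 22)
        dequantized = index * step       # dequantization (block 23)
        residual.append(value - dequantized)   # subtraction (block 24)
    return residual

# Measured index-domain values carrying small embedded deviations.
measured = [3.125, 4.0, 0.625, -8.125, 4.375, 6.0, 5.375]
print(extract_residual(measured))
# [0.125, 0.0, -0.375, -0.125, 0.375, 0.0, 0.375]
```

This only recovers the embedded deviations cleanly when they stay below half a quantization step; smearing or further processing of the signal would add noise to the residual.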
- Figure 6 is a functional flow diagram representing the data flow in an alternative embodiment of a method of extracting a watermark from a watermarked audio signal according to the invention.
- A block diagram of the corresponding embedding detector is shown in figure 6.
- the input audio signal A is transferred from the time domain to the frequency domain in block 30.
- the frequency domain signal is subsequently analyzed in block 31 in order to derive the spatial image parameters IID, ICC.
- the linearized, floating point indexes $ICC_{idx,flt}$ of the spatial image parameters IID, ICC are derived in block 32.
- a set of stereo parameters over the frequency bands may be derived.
- a (least-squares) coefficient can be derived as:
- the coefficient $\alpha$ is collected for a number of N frames.
- the mean of $\alpha$ can be employed as a threshold for symbol detection.
- the coefficient $\alpha$ is gathered for a set of M frames, with $M \gg N$. Then all different offsets $0 \le n < N$ are applied to determine the maximum contrast in the mean of $\alpha$ over N frames.
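One plausible reading of the detection steps above is sketched below. The text does not give the least-squares expression, so the formula used here (projecting the per-band residual of a frame onto a reference pattern) is an assumption, as are the decision rule with a zero threshold, the name `alpha` for the coefficient, and the function names.

```python
def ls_coefficient(residual, pattern):
    """Least-squares alpha minimising sum((residual - alpha * pattern) ** 2)."""
    num = sum(d * p for d, p in zip(residual, pattern))
    den = sum(p * p for p in pattern)
    return num / den

def detect_symbol(frames, pattern):
    """Average alpha over the frames of one symbol and decide on its sign.

    Assumed rule: a positive mean indicates the pattern itself (symbol 0),
    a negative mean indicates the inverted pattern (symbol 1).
    """
    alphas = [ls_coefficient(frame, pattern) for frame in frames]
    mean_alpha = sum(alphas) / len(alphas)
    return (0 if mean_alpha > 0 else 1), mean_alpha

# Reference pattern over four bands and two noisy frames carrying symbol 0.
pattern = [1.0, -1.0, 1.0, -1.0]
frames = [[0.2, -0.3, 0.25, -0.15], [0.3, -0.1, 0.2, -0.25]]

symbol, mean_alpha = detect_symbol(frames, pattern)
print(symbol, mean_alpha)
```

Synchronization, i.e. trying every frame offset and keeping the one with the largest contrast in the mean coefficient, could be built on top of `detect_symbol` by sliding the start index of the frame window.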
- Figures 7 and 8 are flow charts representing the data flows in a method according to the invention for embedding a watermark into an audio signal, wherein the method comprises server precoding.
- the method disclosed in figures 7 and 8 may be combined with Parametric Stereo as an efficient server side embedding.
- Figure 7 represents the data flow in a robust server side embedding pre-encoding.
- Figure 8 represents the data flow in a robust embedding of a personalized watermark.
- the left/right signal pair l, r of an input audio signal A is transformed from time domain to frequency domain.
- the resulting signal is subsequently decomposed in block 41, resulting in two substantially orthogonal signals $y_1$, $y_2$ as well as a set of stereo image parameters IID, ICC (ITD).
- the main signal component $y_1$ is transmitted to block 42 where it is transformed back to the time domain, and the resulting main signal component m is encoded using a legacy mono encoder LME in block 43.
- the resulting mono bit stream MBS is stored at a content server CS, block 45.
- the stereo image parameters IID, ICC are mapped to the linear perceptual domain by means of a mapping function Ll, in block 44.
- the mapped spatial image indices e.g. floating point spatial image indices are also stored on the content server CS, block 45.
- a request RQST is made, e.g. a user request made by a user, to the content server CS, block 45.
- the request may e.g. be a request for a specific song.
- the mono bit-stream MBS and the floating point spatial image parameters ICC_idx,flt corresponding to the song are derived from the content server CS.
- a personalized watermark PWM is constructed.
- Embedding E, block 46, is subsequently applied in a similar fashion to the embedding disclosed in relation to figure 4.
- FIG. 9 is a diagram of a device 100 for embedding and/or extracting a watermark into and/or from an audio signal.
- the device 100 may be a watermark embedder or a watermark detector/extractor.
- the device 100 comprises a transceiver 50, which may receive and/or transmit signals, a microprocessor 60 operable to perform calculations in order to embed or extract watermarks according to one or more embodiments of the method of the invention as well as a memory or storage means 70 for use in one or more of the embodiments of the methods of the invention.
- Stereo image is meant to denote the perceptual attributes that construct the spatial perception of a stereo signal.
- covariance matrix is meant to denote a statistical description of multiple signals in the form of a matrix of correlations between the signals, i.e. a statistical expectation operation. Thus, the covariance matrix describes the relations between two channels on a statistical level.
- spatial image parameters are meant to be synonymous with the terms “stereo parameters” and “spatial cues”, and include the inter-channel intensity difference (IID), the inter-channel correlation or coherence (ICC), the inter-channel phase difference (IPD) and the inter-channel time difference (ITD).
- spatial image parameters:
  - original spatial image parameters: IID, ICC, IPD, ITD
  - quantized spatial image parameters: IID_idx, ICC_idx, IPD_idx, ITD_idx
  - altered quantized spatial image parameters: IID_idx + IID_delta, ICC_idx + ICC_delta, IPD_idx + IPD_delta, ITD_idx + ITD_delta
  - altered mapped spatial image parameters: IID_idx,flt + IID_delta, ICC_idx,flt + ICC_delta, IPD_idx,flt + IPD_delta, ITD_idx,flt + ITD_delta
  - altered or potentially altered spatial image parameters: IID', ICC', IPD', ITD'; IID'_idx, ICC'_idx, IPD'_idx, ITD'_idx
- the invention can be implemented by means of hardware, software, firmware or any combination of these.
- the invention or some of the features thereof can also be implemented as software running on one or more data processors and/or digital signal processors.
- the individual elements of an embodiment of the invention may be physically, functionally and logically implemented in any suitable way such as in a single unit, in a plurality of units or as part of separate functional units.
- the invention may be implemented in a single unit, or be both physically and functionally distributed between different units and processors.
- the invention relates to embedding a watermark into audio signals whilst taking account of the binaural aspects of the human auditory system.
- the invention presents methods of embedding a watermark into a stereo or multichannel audio signal.
- the methods of the invention imply alteration of spatial image parameters of a stereo or multichannel system, viz. the inter-channel intensity difference (IID) and/or the inter-channel correlation (ICC), optionally supplemented by alterations of the inter-channel phase difference (IPD) and/or the inter-channel time difference (ITD).
- IID inter-channel intensity difference
- IPD inter-channel phase difference
- ITD inter-channel time difference
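As an illustration of how the spatial image parameters relate to the covariance matrix of the two channels, the sketch below estimates IID, ICC and IPD for one frequency band. The formulas are the definitions commonly used in parametric stereo coding and are an assumption here, since the text above does not spell them out; the function name `spatial_parameters` and the `eps` guard are likewise hypothetical.

```python
import numpy as np

def spatial_parameters(left_band, right_band, eps=1e-12):
    """Estimate IID (dB), ICC and IPD (radians) for one band of complex spectra.

    p_ll, p_rr and p_lr are the entries of the 2x2 covariance matrix of the
    left and right channel within the band.
    """
    l = np.asarray(left_band, dtype=complex)
    r = np.asarray(right_band, dtype=complex)
    p_ll = np.sum(np.abs(l) ** 2)    # left channel power
    p_rr = np.sum(np.abs(r) ** 2)    # right channel power
    p_lr = np.sum(l * np.conj(r))    # cross term between the channels
    iid_db = 10.0 * np.log10((p_ll + eps) / (p_rr + eps))
    icc = np.abs(p_lr) / np.sqrt((p_ll + eps) * (p_rr + eps))
    ipd = np.angle(p_lr)
    return iid_db, icc, ipd
```

For example, a right channel that is simply the left channel at half amplitude gives an IID of about 6 dB, an ICC close to 1 and an IPD of 0; a watermark embedder of the kind described would then apply small alterations to the (quantized or mapped) values of such parameters.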
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Mathematical Physics (AREA)
- Editing Of Facsimile Originals (AREA)
- Image Processing (AREA)
Abstract
The invention relates to embedding a watermark into audio signals whilst taking account of the binaural aspects of the human auditory system. The invention presents methods of embedding a watermark into a stereo or multichannel audio signal. These methods provide a robust watermark for use with various audio signal coding systems. The methods of the invention involve alteration of spatial image parameters of a stereo or multichannel system, viz. the inter-channel intensity difference (IID) and/or the inter-channel correlation (ICC), optionally supplemented by alterations of the inter-channel phase difference (IPD) and/or the inter-channel time difference (ITD).
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP08151924 | 2008-02-26 | ||
EP08151924.1 | 2008-02-26 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2009107054A1 (fr) | 2009-09-03 |
Family
ID=40666788
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IB2009/050726 WO2009107054A1 (fr) | Method of embedding data in a stereo image | 2008-02-26 | 2009-02-23 |
Country Status (2)
Country | Link |
---|---|
TW (1) | TW200945098A (fr) |
WO (1) | WO2009107054A1 (fr) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102592597A (zh) * | 2011-01-17 | 2012-07-18 | 鸿富锦精密工业(深圳)有限公司 | Electronic device and copyright protection method of audio data thereof |
WO2014011487A1 (fr) * | 2012-07-12 | 2014-01-16 | Dolby Laboratories Licensing Corporation | Embedding data in stereo audio using saturation parameter modulation |
WO2014164138A1 (fr) * | 2013-03-11 | 2014-10-09 | The Nielsen Company (Us), Llc | Down-mixing compensation for audio watermarking |
US9165559B2 (en) | 2011-08-23 | 2015-10-20 | Peter Georg Baum | Method and apparatus for frequency domain watermark processing a multi-channel audio signal in real-time |
US9191516B2 (en) | 2013-02-20 | 2015-11-17 | Qualcomm Incorporated | Teleconferencing using steganographically-embedded audio data |
CN106033671A (zh) * | 2015-03-09 | 2016-10-19 | 华为技术有限公司 | Method and apparatus for determining an inter-channel time difference parameter |
WO2019037714A1 (fr) * | 2017-08-23 | 2019-02-28 | 华为技术有限公司 | Stereo signal encoding method and encoding apparatus |
EP3799045A1 (fr) * | 2019-09-30 | 2021-03-31 | Spotify AB | Systems and methods for embedding data in media content |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI514849B (zh) * | 2012-01-11 | 2015-12-21 | Himax Tech Ltd | Calibration device for a stereoscopic image display system and calibration method thereof |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2002029808A2 (fr) * | 2000-10-04 | 2002-04-11 | University Of Miami | Auxiliary channel masking in an audio signal |
2009
- 2009-02-23 TW TW098105679A patent/TW200945098A/zh unknown
- 2009-02-23 WO PCT/IB2009/050726 patent/WO2009107054A1/fr active Application Filing
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2002029808A2 (fr) * | 2000-10-04 | 2002-04-11 | University Of Miami | Auxiliary channel masking in an audio signal |
Non-Patent Citations (1)
Title |
---|
BREEBAART J ET AL: "Parametric Coding of Stereo Audio", INTERNET CITATION, no. 2005:9, 1 June 2005 (2005-06-01), pages 1305 - 1322, XP002514252, ISSN: 1110-8657, Retrieved from the Internet <URL:http://www.jeroenbreebaart.com/papers/jasp/jasp2005.pdf> [retrieved on 20090210] * |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
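CN102592597A (zh) * | 2011-01-17 | 2012-07-18 | 鸿富锦精密工业(深圳)有限公司 | Electronic device and copyright protection method of audio data thereof |
CN102592597B (zh) * | 2011-01-17 | 2014-08-13 | 鸿富锦精密工业(深圳)有限公司 | Electronic device and copyright protection method of audio data thereof |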
US9196259B2 (en) | 2011-01-17 | 2015-11-24 | Hon Hai Precision Industry Co., Ltd. | Electronic device and copyright protection method of audio data thereof |
US9165559B2 (en) | 2011-08-23 | 2015-10-20 | Peter Georg Baum | Method and apparatus for frequency domain watermark processing a multi-channel audio signal in real-time |
US9357326B2 (en) | 2012-07-12 | 2016-05-31 | Dolby Laboratories Licensing Corporation | Embedding data in stereo audio using saturation parameter modulation |
WO2014011487A1 (fr) * | 2012-07-12 | 2014-01-16 | Dolby Laboratories Licensing Corporation | Embedding data in stereo audio using saturation parameter modulation |
US20150163614A1 (en) * | 2012-07-12 | 2015-06-11 | Dolby Laboratories Licensing Corporation | Embedding data in stereo audio using saturation parameter modulation |
CN104488026A (zh) * | 2012-07-12 | 2015-04-01 | 杜比实验室特许公司 | Embedding data in stereo audio using saturation parameter modulation |
US9191516B2 (en) | 2013-02-20 | 2015-11-17 | Qualcomm Incorporated | Teleconferencing using steganographically-embedded audio data |
US9704494B2 (en) | 2013-03-11 | 2017-07-11 | The Nielsen Company (Us), Llc | Down-mixing compensation for audio watermarking |
WO2014164138A1 (fr) * | 2013-03-11 | 2014-10-09 | The Nielsen Company (Us), Llc | Down-mixing compensation for audio watermarking |
US9514760B2 (en) | 2013-03-11 | 2016-12-06 | The Nielsen Company (Us), Llc | Down-mixing compensation for audio watermarking |
US9093064B2 (en) | 2013-03-11 | 2015-07-28 | The Nielsen Company (Us), Llc | Down-mixing compensation for audio watermarking |
CN106033671A (zh) * | 2015-03-09 | 2016-10-19 | 华为技术有限公司 | Method and apparatus for determining an inter-channel time difference parameter |
CN106033671B (zh) * | 2015-03-09 | 2020-11-06 | 华为技术有限公司 | Method and apparatus for determining an inter-channel time difference parameter |
WO2019037714A1 (fr) * | 2017-08-23 | 2019-02-28 | 华为技术有限公司 | Stereo signal encoding method and encoding apparatus |
CN109427338A (zh) * | 2017-08-23 | 2019-03-05 | 华为技术有限公司 | Stereo signal encoding method and encoding apparatus |
CN109427338B (zh) * | 2017-08-23 | 2021-03-30 | 华为技术有限公司 | Stereo signal encoding method and encoding apparatus |
US11244691B2 (en) | 2017-08-23 | 2022-02-08 | Huawei Technologies Co., Ltd. | Stereo signal encoding method and encoding apparatus |
US11636863B2 (en) | 2017-08-23 | 2023-04-25 | Huawei Technologies Co., Ltd. | Stereo signal encoding method and encoding apparatus |
EP3799045A1 (fr) * | 2019-09-30 | 2021-03-31 | Spotify AB | Systems and methods for embedding data in media content |
US11545122B2 (en) | 2019-09-30 | 2023-01-03 | Spotify Ab | Systems and methods for embedding data in media content |
Also Published As
Publication number | Publication date |
---|---|
TW200945098A (en) | 2009-11-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2009107054A1 (fr) | Method of embedding data in a stereo image | |
JP7161564B2 (ja) | Apparatus and method for estimating an inter-channel time difference | |
CN100589657C (zh) | Economical loudness measurement method and apparatus for coded audio | |
KR100898879B1 (ko) | Audio or video perceptual coding system modulating one or more parameters in response to side information | |
KR101428487B1 (ko) | Method and apparatus for multi-channel encoding and decoding | |
JP2021140170A (ja) | Multi-channel audio decoder, multi-channel audio encoder, methods and computer program using a residual-signal-based adjustment of a contribution of a decorrelated signal | |
TWI714046B (zh) | Apparatus, method or computer program for estimating an inter-channel time difference | |
RU2600527C1 (ru) | Companding system and method for reducing quantization noise using advanced spectral extension | |
RU2628898C1 (ru) | Non-uniform parameter quantization for advanced coupling | |
KR20160111042A (ko) | Stereo audio encoder and decoder | |
JP2022084671A (ja) | Multi-channel signal encoding method, multi-channel signal decoding method, encoder, and decoder | |
KR20120038311A (ko) | Apparatus and method for encoding spatial parameters, and apparatus and method for decoding spatial parameters | |
Bazyar et al. | A New MPEG Layer III Steganography Technique By Changing Quantized Spectrum Values | |
George et al. | Low Power Stereo Perceptual Audio Coding Based on Adaptive Masking Threshold Reuse | |
IL165648A (en) | An audio coding system that uses decoded signal properties to coordinate synthesized spectral components |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 09713781; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: pct application non-entry in european phase | Ref document number: 09713781; Country of ref document: EP; Kind code of ref document: A1 |