EP1639518B1 - Methods and apparatus for embedding watermarks - Google Patents
Methods and apparatus for embedding watermarks
- Publication number
- EP1639518B1 (application EP04776572.2A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- watermarked
- data stream
- transform coefficient
- audio block
- digital data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04K—SECRET COMMUNICATION; JAMMING OF COMMUNICATION
- H04K1/00—Secret communication
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/018—Audio watermarking, i.e. embedding inaudible data in the audio signal
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04H—BROADCAST COMMUNICATION
- H04H20/00—Arrangements for broadcast or for distribution combined with broadcast
- H04H20/28—Arrangements for simultaneous broadcast of plural pieces of information
- H04H20/30—Arrangements for simultaneous broadcast of plural pieces of information by a single channel
- H04H20/31—Arrangements for simultaneous broadcast of plural pieces of information by a single channel using in-band signals, e.g. subsonic or cue signal
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/02—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
- G10L19/0212—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders using orthogonal transformation
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04H—BROADCAST COMMUNICATION
- H04H2201/00—Aspects of broadcast communication
- H04H2201/50—Aspects of broadcast communication characterised by the use of watermarks
Definitions
- the present disclosure relates generally to media measurements, and more particularly, to methods and apparatus for embedding watermarks in a compressed digital data stream.
- AC-3 Digital Audio Compression
- the AC-3 compression standard is based on a perceptual digital audio coding technique that reduces the amount of data needed to reproduce the original audio signal while minimizing perceptible distortion.
- the AC-3 compression standard recognizes that the human ear is unable to perceive [...] (DVDs), digital cable, and satellite transmissions that enables the broadcast of special sound effects (e.g., surround sound).
- MPEG Moving Picture Experts Group
- Watermarking techniques to embed watermarks within video and/or audio data streams compressed in accordance with compression standards such as the AC-3 compression standard and the MPEG Advanced Audio Coding (AAC) compression standard.
- watermarks are digital data that uniquely identify broadcasters and/or programs. Watermarks are typically extracted using a decoding operation at one or more reception sites (e.g., households or other media consumption sites) and, thus, may be used to assess the viewing behaviors of individual households and/or groups of households to produce ratings information.
- existing watermarking techniques are designed for use with analog broadcast systems.
- existing watermarking techniques convert analog program data to an uncompressed digital data stream, insert watermark data in the uncompressed digital data stream, and convert the watermarked data stream to an analog format prior to transmission.
- watermark data may need to be embedded or inserted directly in a compressed digital data stream.
- Existing watermarking techniques may decompress the compressed digital data stream into time-domain samples, insert the watermark data into the time-domain samples, and recompress the watermarked time-domain samples into a watermarked compressed digital data stream.
- Such decompression/compression may cause degradation in the quality of the media content in the compressed digital data stream.
- existing decompression/compression techniques require additional equipment and cause delay of the audio component of a broadcast in a manner that, in some cases, may be unacceptable.
- the methods employed by local broadcasting affiliates to receive compressed digital data streams from their parent networks and to insert local content through sophisticated splicing equipment prevent conversion of a compressed digital data stream to a time-domain (uncompressed) signal prior to recompression of the digital data streams.
- US 6,611,607 B1 discloses identifying a transform coefficient associated with a compressed digital data stream and modifying the transform coefficient to embed a watermark.
- US 6,505,223 B1 discloses obtaining a mantissa code.
- EP 1 104 969 A1 discloses a prior method and apparatus for encoding/decoding and watermarking a data stream. There, mantissas are used to insert a watermark in an uncompressed data stream, which is either a raw input data stream (especially an audio data stream) or a data stream that has been uncompressed prior to watermark insertion.
- WO 97/21293 A1 and WO 03/017254 A1 each describe modifying mantissas to add data to a compressed audio data stream.
- In WO 97/21293 A1, the mantissas of sub-band samples of a compressed audio data stream are modified directly through normalization, addition, and de-normalization to add data to the compressed audio data stream.
- In WO 03/017254 A1, the least significant bits (LSBs) of the mantissa data of compressed digital audio are overwritten to add data to the compressed digital audio.
- LSBs least significant bits
- methods and apparatus for embedding watermarks in compressed digital data streams are disclosed herein.
- the methods and apparatus disclosed herein may be used to embed watermarks in compressed digital data streams without prior decompression of the compressed digital data streams.
- the methods and apparatus disclosed herein eliminate the need to subject compressed digital data streams to multiple decompression/compression cycles, which are typically unacceptable to, for example, affiliates of television broadcast networks because multiple decompression/compression cycles may significantly degrade the quality of media content in the compressed digital data streams.
- the methods and apparatus disclosed herein may be used to unpack the modified discrete cosine transform (MDCT) coefficient sets associated with a compressed digital data stream formatted according to a digital audio compression standard such as the AC-3 compression standard.
- MDCT modified discrete cosine transform
- the mantissas of the unpacked MDCT coefficient sets may be modified to embed watermarks that imperceptibly augment the compressed digital data stream.
- a receiving device (e.g., a set top television metering device at a media consumption site)
- the extracted watermark information may be used to identify the media sources and/or programs (e.g., broadcast stations) associated with media currently being consumed (e.g., viewed, listened to, etc.) at a media consumption site.
- the source and program identification information may be used in known manners to generate ratings information and/or any other information that may be used to assess the viewing behaviors associated with individual households and/or groups of households.
- an example broadcast system 100 including a service provider 110, a television 120, a remote control device 125, and a receiving device 130 is metered using an audience measurement system.
- the components of the broadcast system 100 may be coupled in any well-known manner.
- the television 120 is positioned in a viewing area 150 located within a household occupied by one or more people, referred to as household members 160, some or all of whom have agreed to participate in an audience measurement research study.
- the receiving device 130 may be a set top box (STB), a video cassette recorder, a digital video recorder, a personal video recorder, a personal computer, a digital video disc player, etc. coupled to the television 120.
- the viewing area 150 includes the area in which the television 120 is located and from which the television 120 may be viewed by the one or more household members 160 located in the viewing area 150.
- a metering device 140 is configured to identify viewing information based on video/audio output signals conveyed from the receiving device 130 to the television 120.
- the metering device 140 provides this viewing information as well as other tuning and/or demographic data via a network 170 to a data collection facility 180.
- the network 170 may be implemented using any desired combination of hardwired and wireless communication links, including for example, the Internet, an Ethernet connection, a digital subscriber line (DSL), a telephone line, a cellular telephone system, a coaxial cable, etc.
- the data collection facility 180 may be configured to process and/or store data received from the metering device 140 to produce ratings information.
- the service provider 110 may be implemented by any service provider such as, for example, a cable television service provider 112, a radio frequency (RF) television service provider 114, and/or a satellite television service provider 116.
- the television 120 receives a plurality of television signals transmitted via a plurality of channels by the service provider 110 and may be adapted to process and display television signals provided in any format such as a National Television Standards Committee (NTSC) television signal format, a high definition television (HDTV) signal format, an Advanced Television Systems Committee (ATSC) television signal format, a phase alternation line (PAL) television signal format, a digital video broadcasting (DVB) television signal format, an Association of Radio Industries and Businesses (ARIB) television signal format, etc.
- NTSC National Television Standards Committee
- HDTV high definition television
- ATSC Advanced Television Systems Committee
- PAL phase alternation line
- DVB digital video broadcasting
- ARIB Association of Radio Industries and Businesses
- the user-operated remote control device 125 allows a user (e.g., the household member 160) to cause the television 120 to tune to and receive signals transmitted on a desired channel, and to cause the television 120 to process and present or deliver the programming or media content contained in the signals transmitted on the desired channel.
- the processing performed by the television 120 may include, for example, extracting a video and/or an audio component delivered via the received signal, causing the video component to be displayed on a screen/display associated with the television 120, and causing the audio component to be emitted by speakers associated with the television 120.
- the programming content contained in the television signal may include, for example, a television program, a movie, an advertisement, a video game, a web page, a still image, and/or a preview of other programming content that is currently offered or will be offered in the future by the service provider 110.
- While the components shown in FIG. 1 are depicted as separate structures within the broadcast system 100, the functions performed by some of these structures may be integrated within a single unit or may be implemented using two or more separate components.
- the television 120 and the receiving device 130 are depicted as separate structures, the television 120 and the receiving device 130 may be integrated into a single unit (e.g., an integrated digital television set).
- the television 120, the receiving device 130, and/or the metering device 140 may be integrated into a single unit.
- a watermark embedding system may encode watermarks that uniquely identify broadcasters and/or programs in the broadcast signals from the service providers 110.
- the watermark embedding system may be implemented at the service provider 110 so that each of the plurality of media signals (e.g., television signals) transmitted by the service provider 110 includes one or more watermarks.
- the receiving device 130 may tune to and receive media signals transmitted on a desired channel and cause the television 120 to process and present the programming content contained in the signals transmitted on the desired channel.
- the metering device 140 may identify watermark information based on video/audio output signals conveyed from the receiving device 130 to the television 120. Accordingly, the metering device 140 may provide this watermark information as well as other tuning and/or demographic data to the data collection facility 180 via the network 170.
- an example watermark embedding system 200 includes an embedding device 210 and a watermark source 220.
- the embedding device 210 is configured to insert watermark information 230 from the watermark source 220 into a compressed digital data stream 240.
- the compressed digital data stream 240 may be compressed according to an audio compression standard such as the AC-3 compression standard and/or the MPEG-AAC compression standard, either of which may be used to process blocks of an audio signal using a predetermined number of digitized samples from each block.
- the source of the compressed digital data stream 240 (not shown) may be sampled at a rate of, for example, 48 kilohertz (kHz) to form audio blocks as described below.
- audio compression techniques such as those based on the AC-3 compression standard use overlapped audio blocks and the MDCT algorithm to convert an audio signal into a compressed digital data stream (e.g., the compressed digital data stream 240 of FIG. 2 ).
- Two different block sizes (i.e., short and long blocks)
- AC-3 short blocks may be used to minimize pre-echo for transient segments of the audio signal
- AC-3 long blocks may be used to achieve high compression gain for non-transient segments of the audio signal.
- an AC-3 long block corresponds to a block of 512 time-domain audio samples
- an AC-3 short block corresponds to 256 time-domain audio samples.
- the 512 time-domain samples are obtained by concatenating a preceding (old) block of 256 time-domain samples and a current (new) block of 256 time-domain samples to create an audio block of 512 time-domain samples.
- the AC-3 long block is then transformed using the MDCT algorithm to generate 256 transform coefficients.
- an AC-3 short block is similarly obtained from a pair of consecutive time-domain sample blocks of audio.
- the AC-3 short block is then transformed using the MDCT algorithm to generate 128 transform coefficients.
- the 128 transform coefficients corresponding to two adjacent short blocks are then interleaved to generate a set of 256 transform coefficients.
- processing of either AC-3 long or AC-3 short blocks results in the same number of MDCT coefficients.
- in contrast, in accordance with the MPEG-AAC compression standard, a short block contains 128 samples and a long block contains 1024 samples.
- an uncompressed digital data stream 300 includes a plurality of 256-sample time-domain audio blocks 310, generally shown as A0, A1, A2, A3, A4, and A5.
- the MDCT algorithm processes the audio blocks 310 to generate MDCT coefficient sets 320, shown by way of example as MA0, MA1, MA2, MA3, MA4, and MA5 (where MA5 is not shown).
- the MDCT algorithm may process the audio blocks A0 and A1 to generate the MDCT coefficient set MA0.
- the audio blocks A0 and A1 are concatenated to generate a 512-sample audio block (e.g., an AC-3 long block) that is MDCT transformed using the MDCT algorithm to generate the MDCT coefficient set MA0 which includes 256 MDCT coefficients.
- the audio blocks A1 and A2 may be processed to generate the MDCT coefficient set MA1.
- the audio block A1 is an overlapping audio block because it is used to generate both MDCT coefficient sets MA0 and MA1.
- the MDCT algorithm is used to transform the audio blocks A2 and A3 to generate the MDCT coefficient set MA2, the audio blocks A3 and A4 to generate the MDCT coefficient set MA3, the audio blocks A4 and A5 to generate the MDCT coefficient set MA4, etc.
- the audio block A2 is an overlapping audio block used to generate the MDCT coefficient sets MA1 and MA2
- the audio block A3 is an overlapping audio block used to generate the MDCT coefficient sets MA2 and MA3
- the audio block A4 is an overlapping audio block used to generate the MDCT coefficient sets MA3 and MA4, etc.
- the MDCT coefficient sets 320 form the compressed digital data stream 240.
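The overlapped pairing described above (A0 and A1 yield MA0, A1 and A2 yield MA1, and so on) can be sketched as follows. This is a minimal illustration, not the AC-3 encoder: the function names are hypothetical, the direct cosine-sum MDCT is chosen for readability rather than speed, and the sine window is an assumption made here (the text above does not name a window).

```python
import math

def sine_window(two_n):
    # Assumed window; the text above does not specify one.
    return [math.sin(math.pi * (t + 0.5) / two_n) for t in range(two_n)]

def mdct(block):
    """Direct (slow) MDCT: 2N time-domain samples -> N coefficients."""
    n = len(block) // 2
    return [
        sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
            for t in range(2 * n))
        for k in range(n)
    ]

def coefficient_sets(blocks):
    """Pair each block with its successor and transform the joined block.

    With 256-sample blocks this mirrors the long-block case in the text:
    a 512-sample joined block produces 256 coefficients.
    """
    hop = len(blocks[0])
    window = sine_window(2 * hop)
    sets = []
    for old, new in zip(blocks, blocks[1:]):
        joined = old + new  # preceding (old) block followed by current (new) block
        sets.append(mdct([w * x for w, x in zip(window, joined)]))
    return sets
```

Because every interior block appears in two consecutive pairs, six input blocks yield five coefficient sets; a sixth set would need the first block of the following frame.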
- the embedding device 210 of FIG. 2 may embed or insert the watermark information or watermark 230 from the watermark source 220 into the compressed digital data stream 240.
- the watermark 230 may be used, for example, to uniquely identify broadcasters and/or programs so that media consumption information (e.g., viewing information) and/or ratings information may be produced. Accordingly, the embedding device 210 produces a watermarked compressed digital data stream 250 for transmission.
- the embedding device 210 includes an identifying unit 410, an unpacking unit 420, a modification unit 430, and a repacking unit 440. While the operation of the embedding device 210 is described below in accordance with the AC-3 compression standard, the embedding device 210 may be implemented to operate with additional or other compression standards such as, for example, the MPEG-AAC compression standard. The operation of the embedding device 210 is described in greater detail in connection with FIG. 5 .
- the identifying unit 410 is configured to identify one or more frames 510 associated with the compressed digital data stream 240, a portion of which is shown by way of example as Frame A and Frame B in FIG. 5 .
- the compressed digital data stream 240 may be a digital data stream compressed in accordance with the AC-3 standard (hereinafter "AC-3 data stream"). While the AC-3 data stream 240 may include multiple channels, for purposes of clarity, the following example describes the AC-3 data stream 240 as including only one channel.
- each of the frames 510 includes a plurality of MDCT coefficient sets 520.
- each of the frames 510 includes six MDCT coefficient sets (i.e., six "audblk").
- Frame A includes the MDCT coefficient sets MA0, MA1, MA2, MA3, MA4 and MA5
- Frame B includes the MDCT coefficient sets MB0, MB1, MB2, MB3, MB4 and MB5.
- the identifying unit 410 is also configured to identify header information associated with each of the frames 510, such as, for example, the number of channels associated with the AC-3 data stream 240. While the example AC-3 data stream 240 includes only one channel as noted above, an example compressed digital data stream having multiple channels is described below in connection with FIGS. 7 and 8 .
- the unpacking unit 420 is configured to unpack the MDCT coefficient sets 520 to determine compression information such as, for example, the parameters of the original compression process (i.e., the manner in which an audio compression technique compressed an audio signal or audio data to form the compressed digital data stream 240). For example, the unpacking unit 420 may determine how many bits are used to represent each of the MDCT coefficients within the MDCT coefficient sets 520. Additionally, compression parameters may include information that limits the extent to which the AC-3 data stream 240 may be modified to ensure that the media content conveyed via the AC-3 data stream 240 is of a sufficiently high quality level.
- the embedding device 210 subsequently uses the compression information identified by the unpacking unit 420 to embed/insert the desired watermark information 230 into the AC-3 data stream 240 thereby ensuring that the watermark insertion is performed in a manner consistent with the compression information supplied in the signal.
- the compression information also includes a mantissa and an exponent associated with each MDCT coefficient.
- the AC-3 compression standard employs techniques to reduce the number of bits used to represent each MDCT coefficient.
- Psycho-acoustic masking is one factor that may be utilized by these techniques. For example, the presence of audio energy E_k either at a particular frequency k (e.g., a tone) or spread across a band of frequencies proximate to the particular frequency k (e.g., a noise-like characteristic) creates a masking effect.
- the number of bits used to represent the mantissa M_k of each MDCT coefficient of the MDCT coefficient sets 520 may be determined based on known quantization look-up tables published in the AC-3 compression standard (e.g., the quantization look-up table 600 of FIG. 6 ).
- the quantization look-up table 600 provides mantissa codes or bit patterns and corresponding mantissa values for MDCT coefficients represented by a four-bit number.
- the mantissa M_k may be changed (e.g., augmented) to represent a modified value of an MDCT coefficient to embed a watermark in the AC-3 data stream 240.
- the modification unit 430 is configured to perform an inverse transform of each of the MDCT coefficient sets 520 to generate time-domain audio blocks 530, shown by way of example as TA0', TA3", TA4', TA4", TA5', TA5", TB0', TB0", TB1', TB1", and TB5' (TA0" through TA3' and TB2' through TB4" are not shown).
- the modification unit 430 performs inverse transform operations to generate sets of previous (old) time-domain audio blocks (which are represented as prime blocks) and sets of current (new) time-domain audio blocks (which are represented as double-prime blocks) associated with the 256-sample time-domain audio blocks that were concatenated to form the MDCT coefficient sets 520 of the AC-3 data stream 240.
- the modification unit 430 performs an inverse transform on the MDCT coefficient set MA5 to generate time-domain blocks TA4" and TA5', the MDCT coefficient set MB0 to generate TA5" and TB0', the MDCT coefficient set MB1 to generate TB0" and TB1', etc.
- the modification unit 430 generates reconstructed time-domain audio blocks 540, which provide a reconstruction of the original time-domain audio blocks that were compressed to form the AC-3 data stream 240.
- the modification unit 430 may add time-domain audio blocks based on, for example, the known Princen-Bradley time domain alias cancellation (TDAC) technique as described in Princen et al., Analysis/Synthesis Filter Bank Design Based on Time Domain Aliasing Cancellation, Institute of Electrical and Electronics Engineers (IEEE) Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-34, No. 5, pp. 1153-1161 (1986).
- TDAC Princen-Bradley time domain alias cancellation
- the modification unit 430 may reconstruct the time-domain audio block TA5 (i.e., TA5R) by adding the prime time-domain audio block TA5' and the double-prime time-domain audio block TA5" using the Princen-Bradley TDAC technique.
- the modification unit 430 may reconstruct the time-domain audio block TB0 (i.e., TB0R) by adding the prime audio block TB0' and the double-prime audio block TB0" using the Princen-Bradley TDAC technique. In this manner, the original time-domain audio blocks used to form the AC-3 data stream 240 are reconstructed to enable the watermark 230 to be embedded or inserted directly into the AC-3 data stream 240.
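The reconstruction step described above, in which the two partial blocks obtained from consecutive inverse transforms are added so that the aliasing cancels, can be sketched as follows. This is an illustration under stated assumptions rather than the AC-3 decoder: the function names are hypothetical, the sine window (applied at both analysis and synthesis) is assumed because it satisfies the Princen-Bradley condition, and the 2/N inverse scaling matches that windowed convention.

```python
import math

def _basis(t, k, n):
    return math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))

def sine_window(two_n):
    # Assumed window: w[t]^2 + w[t + N]^2 == 1 (Princen-Bradley condition).
    return [math.sin(math.pi * (t + 0.5) / two_n) for t in range(two_n)]

def mdct(block):
    n = len(block) // 2
    return [sum(block[t] * _basis(t, k, n) for t in range(2 * n))
            for k in range(n)]

def imdct(coeffs):
    n = len(coeffs)
    return [2.0 / n * sum(coeffs[k] * _basis(t, k, n) for k in range(n))
            for t in range(2 * n)]

def analyze(blocks):
    """Overlapped forward transform: set i is built from blocks i and i+1."""
    hop = len(blocks[0])
    w = sine_window(2 * hop)
    return [mdct([wi * x for wi, x in zip(w, old + new)])
            for old, new in zip(blocks, blocks[1:])]

def reconstruct(sets):
    """Overlap-add the inverse transforms of consecutive coefficient sets.

    Each inverse transform yields two aliased half-outputs. The second half
    of set i and the first half of set i+1 both cover block i+1; adding them
    cancels the aliasing, so blocks 1 .. len(sets)-1 are returned.
    """
    hop = len(sets[0])
    w = sine_window(2 * hop)
    halves = []
    for coeffs in sets:
        y = [wi * v for wi, v in zip(w, imdct(coeffs))]
        halves.append((y[:hop], y[hop:]))
    return [[a + b for a, b in zip(tail, head)]
            for (_, tail), (head, _) in zip(halves, halves[1:])]
```

Neither half-output alone equals the original samples; only the sum does, which is why each reconstructed block in the text needs both a prime and a double-prime block.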
- the modification unit 430 is also configured to insert the watermark 230 into the reconstructed time-domain audio blocks 540 to generate watermarked time-domain audio blocks 550, shown by way of example as TA0W, TA4W, TA5W, TB0W, TB1W and TB5W (blocks TA1W, TA2W, TA3W, TB2W, TB3W and TB4W are not shown).
- the modification unit 430 generates a modifiable time-domain audio block by concatenating two adjacent reconstructed time-domain audio blocks to create a 512-sample audio block.
- the modification unit 430 may concatenate the reconstructed time-domain audio blocks TA5R and TB0R (each being a 256-sample audio block) to form a 512-sample audio block.
- the modification unit 430 may then insert the watermark 230 into the 512-sample audio block formed by the reconstructed time-domain audio blocks TA5R and TB0R to generate the watermarked time-domain audio blocks TA5W and TB0W.
- Encoding processes such as those described in U.S. Patent Nos. 6,272,176, 6,504,870, and 6,621,881 may be used to insert the watermark 230 into the reconstructed time-domain audio blocks 540.
- the disclosures of U.S. Patent Nos. 6,272,176, 6,504,870, and 6,621,881 are hereby incorporated by reference herein in their entireties.
- watermarks may be inserted into a 512-sample audio block.
- each 512-sample audio block carries one bit of embedded or inserted data of the watermark 230.
- spectral frequency components with indices f_1 and f_2 may be modified or augmented to insert data bits associated with the watermark 230.
- a power at the first spectral frequency associated with the index f_1 may be increased or augmented to be a spectral power maximum within a frequency neighborhood (e.g., a frequency neighborhood defined by the indices f_1 - 2, f_1 - 1, f_1, f_1 + 1, and f_1 + 2).
- the power at the second spectral frequency associated with the index f_2 is attenuated or augmented to be a spectral power minimum within a frequency neighborhood (e.g., a frequency neighborhood defined by the indices f_2 - 2, f_2 - 1, f_2, f_2 + 1, and f_2 + 2).
- conversely, for the opposite bit value, the power at the first spectral frequency associated with the index f_1 is attenuated to be a local spectral power minimum while the power at the second spectral frequency associated with the index f_2 is increased to a local spectral power maximum.
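The two complementary spectral modifications described above can be sketched on a list of spectral power values. This is a hedged illustration only: the function names, the `margin` factor, and the choice of which bit value raises f_1 are assumptions made here, and the sketch ignores the masking analysis that a real encoder would use to keep the change inaudible. Powers are assumed to be strictly positive.

```python
def neighborhood(index, size, radius=2):
    """Indices index-2 .. index+2, clipped to the spectrum bounds."""
    return range(max(0, index - radius), min(size, index + radius + 1))

def embed_bit(power, f1, f2, bit, margin=1.1):
    """Return a copy of `power` with f1 and f2 forced to local extremes.

    Assumed mapping: bit 1 makes f1 a local maximum and f2 a local minimum;
    bit 0 swaps the two roles.
    """
    out = list(power)
    hi, lo = (f1, f2) if bit == 1 else (f2, f1)

    # Raise the "hi" bin just above every neighbor (only if it is not already).
    peers_hi = [out[i] for i in neighborhood(hi, len(out)) if i != hi]
    out[hi] = max(out[hi], max(peers_hi) * margin)

    # Lower the "lo" bin just below every neighbor (only if it is not already).
    peers_lo = [out[i] for i in neighborhood(lo, len(out)) if i != lo]
    out[lo] = min(out[lo], min(peers_lo) / margin)
    return out
```

A decoder following the same convention would only need to compare each of the two bins against its neighborhood to recover the bit.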
- based on the watermarked time-domain audio blocks 550, the modification unit 430 generates watermarked MDCT coefficient sets 560, shown by way of example as MA0W, MA4W, MA5W, MB0W and MB5W (blocks MA1W, MA2W, MA3W, MB1W, MB2W, MB3W and MB4W are not shown).
- the modification unit 430 generates the watermarked MDCT coefficient set MA5W based on the watermarked time-domain audio blocks TA5W and TB0W.
- the modification unit 430 concatenates the watermarked time-domain audio blocks TA5W and TB0W to form a 512-sample audio block and converts the 512-sample audio block into the watermarked MDCT coefficient set MA5W which, as described in greater detail below, may be used to modify the original MDCT coefficient set MA5.
- the difference between the MDCT coefficient sets 520 and the watermarked MDCT coefficient sets 560 represents a change in the AC-3 data stream 240 as a result of embedding or inserting the watermark 230.
- the modification unit 430 may modify the mantissa values in the MDCT coefficient set MA5 based on the differences between the coefficients in the corresponding watermarked MDCT coefficient set MA5W and the coefficients in the original MDCT coefficient set MA5.
- Quantization look-up tables e.g., the look-up table 600 of FIG.
- the new mantissa values represent the change in or augmentation of the AC-3 data stream 240 as a result of embedding or inserting the watermark 230. It is important to note that, in this example implementation, the exponents of the MDCT coefficients are not changed. Changing the exponents might require that the underlying compressed signal representation be recomputed, thereby requiring the compressed signal to undergo a true decompression/compression cycle.
- the affected MDCT mantissa is set to a maximum or minimum value, as appropriate.
- the redundancy included in the watermarking process allows the correct watermark to be decoded in the presence of such an encoding restriction.
- the example quantization look-up table 600 includes mantissa codes and mantissa values for a fifteen-level quantization of an example mantissa M k in the range of -0.9333 to +0.9333. While the example quantization look-up table 600 provides mantissa information associated with MDCT coefficients that are represented using four bits, the AC-3 compression standard provides quantization look-up tables associated with other suitable numbers of bits per MDCT coefficient. To illustrate one manner in which the modification unit 430 may modify a particular MDCT coefficient m k with a mantissa M k contained in the MDCT coefficient set MA5, assume the original mantissa value is -0.2666 (i.e., -4/15).
- the mantissa code corresponding to the particular MDCT coefficient m k in the MDCT coefficient set MA5 is determined to be 0101.
- the watermarked MDCT coefficient set MA5W includes a watermarked MDCT coefficient wm k with a mantissa value WM k.
- the new mantissa value of the corresponding watermarked MDCT coefficient wm k of the watermarked MDCT coefficient set MA5W is -0.4300, which lies between the mantissa codes of 0011 and 0100.
- the watermark 230 results in a difference of approximately -0.1634 between the original mantissa value of -0.2666 and the watermarked mantissa value of -0.4300.
- the modification unit 430 may use the watermarked MDCT coefficient set MA5W to modify or augment the MDCT coefficients in the MDCT coefficient set MA5.
- mantissa code 0011 or mantissa code 0100 may replace the mantissa code 0101 associated with the MDCT coefficient m k because the watermarked mantissa WM k associated with the corresponding watermarked MDCT coefficient wm k lies between the mantissa codes of 0011 and 0100 (because the mantissa value corresponding to the watermarked MDCT coefficient wm k is -0.4300).
- the mantissa value corresponding to the mantissa code 0011 is -0.5333 (i.e., -8/15) and the mantissa value corresponding to the mantissa code 0100 is -0.4 (i.e., -6/15).
- the modification unit 430 selects the mantissa code 0100 instead of the mantissa code 0011 to replace the original mantissa code 0101 associated with the MDCT coefficient m k because the mantissa value -0.4 corresponding to the mantissa code 0100 is closest to the desired watermark mantissa value -0.4300.
- each of the MDCT coefficients in the MDCT coefficient set MA5 may be modified in the manner described above. If a watermarked mantissa value is outside the quantization range of mantissa values (i.e., greater than 0.9333 or less than -0.9333), either the positive limit of 1110 or the negative limit of 0000 is selected as the new mantissa code, as appropriate. Additionally, and as discussed above, while the mantissa codes associated with each MDCT coefficient of an MDCT coefficient set may be modified as described above, the exponents associated with the MDCT coefficients remain unchanged.
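The code selection described above can be sketched as follows. The sketch assumes the fifteen-level table maps code c (0000 through 1110) to the value (2c - 14)/15, which matches the values quoted for codes 0011, 0100, and 0101; the function names are hypothetical.

```python
LEVELS = 15  # fifteen-level quantization, codes 0000 through 1110

def mantissa_value(code):
    """Mantissa value for a 4-bit code, assuming evenly spaced levels
    from -14/15 to +14/15 in steps of 2/15."""
    return (2 * code - (LEVELS - 1)) / LEVELS

def nearest_mantissa_code(watermarked_value):
    """Pick the code whose value is closest to the watermarked mantissa,
    clamping to the negative limit 0000 or the positive limit 1110."""
    code = round((watermarked_value * LEVELS + (LEVELS - 1)) / 2)
    return min(max(code, 0), LEVELS - 1)

original_code = nearest_mantissa_code(-0.2666)  # original mantissa, about -4/15
new_code = nearest_mantissa_code(-0.4300)       # watermarked mantissa
print(format(original_code, "04b"), "->", format(new_code, "04b"))
```

Only the mantissa code changes in this sketch; the exponent is left untouched, consistent with the description.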
- the repacking unit 440 is configured to repack the watermarked MDCT coefficient sets 560 associated with each frame of the AC-3 data stream 240 for transmission.
- the repacking unit 440 identifies the position of each MDCT coefficient set within a frame of the AC-3 data stream 240 so that the corresponding watermarked MDCT coefficient set can be used to modify the MDCT coefficient set.
- the repacking unit 440 may identify the position of and modify the MDCT coefficient sets MA0 to MA5 based on the corresponding watermarked MDCT coefficient sets MA0W to MA5W in the corresponding identified positions.
- the AC-3 data stream 240 remains a compressed digital data stream while the watermark 230 is embedded or inserted in the AC-3 data stream 240.
- the embedding device 210 inserts the watermark 230 into the AC-3 data stream 240 without additional decompression/compression cycles that may degrade the quality of the media content in the AC-3 data stream 240.
- an uncompressed digital data stream 700 may include a plurality of audio block sets 710.
- Each of the audio block sets 710 may include audio blocks associated with multiple channels 720 and 730 including, for example, a front left channel, a front right channel, a center channel, a surround left channel, a surround right channel, and a low-frequency effect (LFE) channel (e.g., a sub-woofer channel).
- the audio block set AUD0 includes an audio block A0L associated with the front left channel, an audio block A0R associated with the front right channel, an audio block A0C associated with the center channel, an audio block A0SL associated with the surround left channel, an audio block A0SR associated with the surround right channel, and an audio block A0LFE associated with the LFE channel.
- the audio block set AUD1 includes an audio block A1L associated with the front left channel, an audio block A1R associated with the front right channel, an audio block A1C associated with the center channel, an audio block A1SL associated with the surround left channel, an audio block A1SR associated with the surround right channel, and an audio block A1LFE associated with the LFE channel.
- Each of the audio blocks associated with a particular channel in the audio block sets 710 may be processed in a manner similar to that described above in connection with FIGS. 5 and 6 .
- the audio blocks associated with the center channel 810 of FIG. 8 shown by way of example as A0C, A1C, A2C, and A3C, may be transformed to generate the MDCT coefficient sets 820 associated with a compressed digital data stream 800.
- each of the MDCT coefficient sets 820 may be derived from a 512-sample audio block formed by concatenating a preceding (old) 256-sample audio block and a current (new) 256-sample audio block.
- the MDCT algorithm may then process the time-domain audio blocks 810 (e.g., A0C through A5C) to generate the MDCT coefficient sets (e.g., M0C through M5C).
- Based on the MDCT coefficient sets 820 of the compressed digital data stream 800, the identifying unit 410 identifies a plurality of frames (not shown) and header information associated with each of the frames as described above.
- the header information includes compression information associated with the compressed digital data stream 800.
- For each of the frames, the unpacking unit 420 unpacks the MDCT coefficient sets 820 to determine the compression information associated with the MDCT coefficient sets 820. For example, the unpacking unit 420 may identify the number of bits used by the original compression process to represent the mantissa of each MDCT coefficient in each of the MDCT coefficient sets 820. Such compression information may be used to embed the watermark 230 as described above in connection with FIG. 6 .
- the modification unit 430 then generates inverse transformed time-domain audio blocks 830, shown by way of example as TA0C", TA1C', TA1C", TA2C', TA2C", and TA3C'.
- the time-domain audio blocks 830 include a set of previous (old) time-domain audio blocks (which are represented as prime blocks) and a set of current (new) time-domain audio blocks (which are represented as double-prime blocks).
- original time-domain audio blocks compressed to form the AC-3 digital data stream 800 may be reconstructed (i.e., the reconstructed time-domain audio blocks 840).
- the modification unit 430 may add the time-domain audio blocks TA1C' and TA1C" to reconstruct the time-domain audio block TA1C (i.e., TA1CR).
- the modification unit 430 may add the time-domain audio blocks TA2C' and TA2C" to reconstruct the time-domain audio block TA2C (i.e., TA2CR).
- the modification unit 430 concatenates two adjacent reconstructed time-domain audio blocks to create a 512-sample audio block (i.e., a modifiable time-domain audio block).
- the modification unit 430 may concatenate the reconstructed time-domain audio blocks TA1CR and TA2CR, each of which is a 256-sample short block, to form a 512-sample audio block.
- the modification unit 430 then inserts the watermark 230 into the 512-sample audio block formed by the reconstructed time-domain audio blocks TA1CR and TA2CR to generate the watermarked time-domain audio blocks TA1CW and TA2CW.
- the modification unit 430 may generate the watermarked MDCT coefficient sets 860. For example, the modification unit 430 may concatenate the watermarked time-domain audio blocks TA1CW and TA2CW to generate the watermarked MDCT coefficient set M1CW. The modification unit 430 modifies the MDCT coefficient sets 820 based on a corresponding one of the watermarked MDCT coefficient sets 860. For example, the modification unit 430 may use the watermarked MDCT coefficient set M1CW to modify the original MDCT coefficient set M1C. The modification unit 430 may then repeat the process described above for the audio blocks associated with each channel to insert the watermark 230 into the compressed digital data stream 800.
- FIG. 9 is a flow diagram depicting one manner in which the example watermark embedding system of FIG. 2 may be configured to embed or insert watermarks in a compressed digital data stream.
- the example process of FIG. 9 may be implemented as machine accessible instructions utilizing any of many different programming codes stored on any combination of machine-accessible media such as a volatile or nonvolatile memory or other mass storage device (e.g., a floppy disk, a CD, and a DVD).
- the machine accessible instructions may be embodied in a machine-accessible medium such as a programmable gate array, an application specific integrated circuit (ASIC), an erasable programmable read only memory (EPROM), a read only memory (ROM), a random access memory (RAM), a magnetic media, an optical media, and/or any other suitable type of medium.
- the process begins with the identifying unit 410 ( FIG. 4 ) identifying a frame associated with the compressed digital data stream 240 ( FIG. 2 ) such as Frame A ( FIG. 5 ) (block 910).
- the identified frame may include a plurality of MDCT coefficient sets formed by overlapping and concatenating a plurality of audio blocks.
- a frame may include six MDCT coefficient sets (i.e., six "audblk").
- the identifying unit 410 also identifies header information associated with the frame (block 920). For example, the identifying unit 410 may identify the number of channels associated with the compressed digital data stream 240.
- the unpacking unit 420 then unpacks the plurality of MDCT coefficient sets to determine compression information associated with the original compression process used to generate the compressed digital data stream 240 (block 930). In particular, the unpacking unit 420 identifies the mantissa M k and the exponent X k of each MDCT coefficient m k of each of the MDCT coefficient sets. The exponents of the MDCT coefficients may then be grouped in a manner compliant with the AC-3 compression standard.
- the unpacking unit 420 ( FIG.
- Control then proceeds to block 940 which is described in greater detail below in connection with FIG. 10 .
- the modification process 940 begins by using the modification unit 430 ( FIG. 4 ) to perform an inverse transform of the MDCT coefficient sets to generate inverse transformed time-domain audio blocks (block 1010).
- the modification unit 430 generates a previous (old) time-domain audio block (which, for example, is represented as a prime block in FIG. 5 ) and a current (new) time-domain audio block (which is represented as a double-prime block in FIG. 5 ) associated with each of the 256-sample original time-domain audio blocks used to generate the corresponding MDCT coefficient set.
- the modification unit 430 may generate TA4" and TA5' from the MDCT coefficient set MA5, TA5" and TB0' from the MDCT coefficient set MB0, and TB0" and TB1' from the MDCT coefficient set MB1. For each time-domain audio block, the modification unit 430 adds corresponding prime and double-prime blocks to reconstruct the time-domain audio block based on, for example, the Princen-Bradley TDAC technique (block 1020).
- the prime block TA5' and the double-prime block TA5" may be added to reconstruct the time-domain audio block TA5 (i.e., the reconstructed time-domain audio block TA5R) while the prime block TB0' and the double-prime block TB0" may be added to reconstruct the time-domain audio block TB0 (i.e., the reconstructed time-domain audio block TB0R).
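The prime/double-prime reconstruction can be checked numerically with a minimal sketch. It assumes a sine window, which satisfies the Princen-Bradley condition (AC-3 specifies its own window, not reproduced here), and a direct matrix MDCT with 2/N scaling on the inverse. The names `mdct`, `imdct`, `t0`, `t1`, and `t2` are illustrative only.

```python
import numpy as np

N = 256                       # new samples contributed by each audio block
n = np.arange(2 * N)          # sample positions within a 512-sample block
k = np.arange(N)              # MDCT coefficient indices
# Assumed sine window: w[n]^2 + w[n + N]^2 == 1 (Princen-Bradley condition).
WINDOW = np.sin(np.pi * (n + 0.5) / (2 * N))
BASIS = np.cos(np.pi / N * np.outer(n + 0.5 + N / 2, k + 0.5))

def mdct(block_512):
    """Windowed forward MDCT: 512 time samples -> 256 coefficients."""
    return (WINDOW * block_512) @ BASIS

def imdct(coeffs_256):
    """Windowed inverse MDCT: 256 coefficients -> 512 aliased time samples."""
    return WINDOW * (BASIS @ coeffs_256) * (2.0 / N)

rng = np.random.default_rng(0)
t0, t1, t2 = rng.standard_normal((3, N))   # three consecutive 256-sample blocks

m0 = mdct(np.concatenate([t0, t1]))        # coefficient set in which t1 is the new block
m1 = mdct(np.concatenate([t1, t2]))        # coefficient set in which t1 is the old block

t1_prime = imdct(m0)[N:]                   # second half of the inverse of m0
t1_double_prime = imdct(m1)[:N]            # first half of the inverse of m1
t1_reconstructed = t1_prime + t1_double_prime  # time-domain aliasing cancels
```

Neither half alone equals the original block; only the sum of the two aliased halves does, which is why two adjacent coefficient sets are needed to reconstruct each 256-sample block.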
- To insert the watermark 230, the modification unit 430 generates modifiable time-domain audio blocks using the reconstructed time-domain audio blocks (block 1030).
- the modification unit 430 generates a modifiable 512-sample time-domain audio block using two adjacent reconstructed time-domain audio blocks.
- the modification unit 430 may generate a modifiable time-domain audio block by concatenating the reconstructed time-domain audio blocks TA5R and TB0R of FIG. 5 .
- the modification unit 430 inserts the watermark 230 from the watermark source 220 into the modifiable time-domain audio blocks (block 1040).
- the modification unit 430 may insert the watermark 230 into the 512-sample time-domain audio block generated using the reconstructed time-domain audio blocks TA5R and TB0R to generate the watermarked time-domain audio blocks TA5W and TB0W.
- Based on the watermarked time-domain audio blocks and the compression information, the modification unit 430 generates watermarked MDCT coefficient sets (block 1050). As noted above, two watermarked time-domain audio blocks, where each block includes 256 samples, may be used to generate a watermarked MDCT coefficient set. For example, the watermarked time-domain audio blocks TA5W and TB0W may be concatenated and then used to generate the watermarked MDCT coefficient set MA5W.
- the modification unit 430 calculates the mantissa value associated with each of the watermarked MDCT coefficients in the watermarked MDCT coefficient set MA5W as described above in connection with FIG. 6 . In this manner, the modification unit 430 can modify or augment the original MDCT coefficient sets using the watermarked MDCT coefficient sets to embed or insert the watermark 230 in the compressed digital data stream 240 (block 1060). Following the above example, the modification unit 430 may replace the original MDCT coefficient set MA5 based on the watermarked MDCT coefficient set MA5W of FIG. 5 .
- the modification unit 430 may replace an original MDCT coefficient in the MDCT coefficient set MA5 with a corresponding watermarked MDCT coefficient (which has an augmented mantissa value) from the watermarked MDCT coefficient set MA5W.
- the modification process 940 terminates and returns control to block 950.
- the repacking unit 440 repacks the frame of the compressed digital data stream (block 950).
- the repacking unit 440 identifies the position of the MDCT coefficient sets within the frame so that the modified MDCT coefficient sets may be substituted in the positions of the original MDCT coefficient sets to rebuild the frame.
- If the embedding device 210 determines that additional frames of the compressed digital data stream 240 need to be processed, then control returns to block 910. If, instead, all frames of the compressed digital data stream 240 have been processed, then the process 900 terminates.
- known watermarking techniques typically decompress a compressed digital data stream into uncompressed time-domain samples, insert the watermark into the time-domain samples, and recompress the watermarked time-domain samples into a watermarked compressed digital data stream.
- the digital data stream 240 remains compressed during the example unpacking, modifying, and repacking processes described herein.
- the watermark 230 is embedded into the compressed digital data stream 240 without additional decompression/compression cycles that may degrade the quality of the content in the compressed digital data stream 240.
- FIG. 11 depicts one manner in which a data frame (e.g., an AC-3 frame) may be processed.
- the example frame processing process 1100 begins with the embedding device 210 reading the header information of the acquired frame (e.g., an AC-3 frame) (block 1110) and initializing an MDCT coefficient set count to zero (block 1120).
- each AC-3 frame includes six MDCT coefficient sets having compressed-domain data (e.g., MA0, MA1, MA2, MA3, MA4 and MA5 of FIG. 5 , which are also known as "audblks" in the AC-3 standard).
- the embedding device 210 determines whether the MDCT coefficient set count is equal to six (block 1130). If the MDCT coefficient set count is not yet equal to six, thereby indicating that at least one more MDCT coefficient set requires processing, the embedding device 210 extracts the exponent (block 1140) and the mantissa (block 1150) associated with an MDCT coefficient of the frame (e.g., the original mantissa M k described above in connection with FIG. 6 ). The embedding device 210 computes a new mantissa associated with a code symbol read at block 1220 (e.g., the new mantissa WM k described above in connection with FIG.
- the embedding device 210 increments the MDCT coefficient set count by one (block 1180) and control returns to block 1130.
- While the example process of FIG. 11 is described above to include six MDCT coefficient sets (e.g., the threshold of the MDCT coefficient set count is six), a process utilizing more or fewer MDCT coefficient sets could be used instead.
- If the MDCT coefficient set count is equal to six, then all MDCT coefficient sets have been processed such that the watermark has been embedded and the embedding device 210 repacks the frame (block 1190).
- a code signal (e.g., a watermark) may include information at a combination of ten different frequencies, which are detectable by a decoder using a Fourier spectral analysis of a sequence of audio samples (e.g., a sequence of 12,288 audio samples as described in detail below).
- an audio signal may be sampled at a rate of 48 kilohertz (kHz) to output an audio sequence of 12,288 audio samples that may be processed (e.g., using a Fourier transform) to acquire a relatively high-resolution (e.g., 3.9 Hz) frequency domain representation of the uncompressed audio signal.
- a sinusoidal code signal having constant amplitude across an entire sequence of audio samples is unacceptable because the sinusoidal code signal may be perceptible to the human ear.
- the sinusoidal code signal is synthesized across the entire sequence of 12,288 audio samples using a masking energy analysis which determines a local sinusoidal amplitude within each block of audio samples (e.g., wherein each block of audio samples may include 512 audio samples).
- the local sinusoidal waveforms may be coherent (in-phase) across the sequence of 12,288 audio samples but have varying amplitudes based on the masking energy analysis.
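The 3.9 Hz resolution and the coherent, amplitude-varying sinusoid can be illustrated with a short sketch. The per-block amplitudes stand in for the output of a masking energy analysis and are hypothetical, as are the cosine phase, the use of non-overlapping 512-sample blocks, and the function name.

```python
import numpy as np

SAMPLE_RATE_HZ = 48_000
SEQUENCE_LEN = 12_288       # audio samples in one code sequence
BLOCK_LEN = 512             # audio samples per masking-analysis block

# Frequency spacing of a 12,288-point Fourier analysis at 48 kHz.
resolution_hz = SAMPLE_RATE_HZ / SEQUENCE_LEN

def synthesize_component(freq_index, block_amplitudes):
    """One sinusoidal code component that stays in phase across the whole
    sequence while its amplitude changes from block to block.
    `block_amplitudes` is an assumed stand-in for masking analysis output."""
    k = np.arange(SEQUENCE_LEN)
    carrier = np.cos(2 * np.pi * freq_index * k / SEQUENCE_LEN)
    envelope = np.repeat(np.asarray(block_amplitudes, dtype=float), BLOCK_LEN)
    return envelope * carrier

num_blocks = SEQUENCE_LEN // BLOCK_LEN             # 24 non-overlapping blocks
amplitudes = np.linspace(0.01, 0.05, num_blocks)   # hypothetical local amplitudes
component = synthesize_component(240, amplitudes)
print(resolution_hz)
```

Because a single carrier is generated for the full sequence and only the envelope varies, the component remains coherent from block to block.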
- FIG. 12 depicts one manner in which a watermark, such as that disclosed by Jensen et al., may be inserted in a compressed audio signal.
- the example process 1200 begins with initializing a frame count to zero (block 1210).
- Eight frames representing a total of 12,288 audio samples of each audio channel may be processed to embed one or more code symbols (e.g., one or more of the symbols "0", "1", "S", and "E" shown in FIG. 13 and described in Jensen, et al.) into the audio signal.
- the embedding device 210 may read a watermark 230 from the watermark source 220 to inject one or more code symbols into the sequence of frames (block 1220).
- the embedding device 210 may acquire one of the frames (block 1230) and proceed to the frame processing operation 1100 described above to process the acquired frame. Accordingly, the example frame processing operation 1100 terminates and control returns to block 1250 to increment the frame count by one.
- the embedding device 210 determines whether the frame count is eight (block 1260). If the frame count is not eight, the embedding device 210 returns to acquire another frame in the sequence and repeat the example frame processing operation 1100 as described above in connection with FIG. 11 to process another frame. If, instead, the frame count is eight, the embedding device 210 returns to block 1210 to reinitialize the frame count to zero and repeat the process 1200 to process another sequence of frames.
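The two counters described for FIGS. 11 and 12 can be sketched as nested loops. The data layout (a frame as a list of six coefficient sets) and the `modify_set` callable are hypothetical placeholders for the unpacking, mantissa modification, and repacking steps.

```python
FRAMES_PER_SEQUENCE = 8      # frames processed per code symbol sequence
SETS_PER_FRAME = 6           # MDCT coefficient sets ("audblks") per AC-3 frame
NEW_SAMPLES_PER_SET = 256    # new audio samples represented by each set

def process_frame(frame, symbol, modify_set):
    """FIG. 11 style loop: visit each coefficient set until the count is six."""
    set_count = 0
    modified = []
    while set_count != SETS_PER_FRAME:
        modified.append(modify_set(frame[set_count], symbol))
        set_count += 1
    return modified          # stands in for the repacked frame

def process_stream(frames, symbols, modify_set):
    """FIG. 12 style loop: read a new symbol whenever the frame count is zero."""
    output = []
    symbol_iter = iter(symbols)
    frame_count = 0
    symbol = None
    for frame in frames:
        if frame_count == 0:
            symbol = next(symbol_iter)
        output.append(process_frame(frame, symbol, modify_set))
        frame_count += 1
        if frame_count == FRAMES_PER_SEQUENCE:
            frame_count = 0  # start the next eight-frame sequence
    return output
```

Eight frames of six sets, each contributing 256 new samples, account for the 12,288 samples per channel mentioned above.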
- a code signal (e.g., the watermark 230) may be embedded or injected into the compressed digital data stream (e.g., an AC-3 data stream).
- the code signal may include a combination of ten sinusoidal components corresponding to frequency indices f 1 through f 10 to represent one of four code symbols "0," "1," "S," and "E."
- the code symbol "0" may represent a binary value of zero and the code symbol "1" may represent a binary value of one.
- the code symbol "S" may represent the start of a message and the code symbol "E" may represent the end of a message.
- table 1300 lists the transform bins corresponding to the center frequencies about which the ten sinusoidal components for each symbol are located.
- the 512-sample central frequency indices (e.g., 10, 12, 14, 16, 18, 20, 22, 24, 26, and 28) each equal the corresponding 12,288-sample central frequency index divided by 24 (i.e., 12,288/512).
- the 12,288-sample central frequency indices (e.g., 240, 288, 336, 384, 432, 480, 528, 576, 624, and 672) are associated with a high resolution frequency domain representation of the compressed digital data stream.
- each code symbol may be formed using ten sinusoidal components associated with the frequency indices f 1 through f 10 depicted in table 1300.
- a code signal for injecting or embedding the code symbol "0" includes ten sinusoidal components corresponding to the frequency indices 237, 289, 339, 383, 429, 481, 531, 575, 621, and 673, respectively.
- a code signal for injecting or embedding the code symbol "1" includes ten sinusoidal components corresponding to the frequency indices 239, 291, 337, 381, 431, 483, 529, 573, 623, and 675, respectively.
- each of the frequency indices f 1 through f 10 has a unique frequency value at or proximate to each of the 12,288-sample central frequency indices.
- each of the ten sinusoidal components associated with the frequency indices f 1 through f 10 may be synthesized in the time domain using the methods and apparatus described herein.
- the code signal for injecting or embedding the code symbol "0" may include sinusoids c 1 ( k ), c 2 ( k ), c 3 ( k ), c 4 ( k ), c 5 ( k ), c 6 ( k ), c 7 ( k ), c 8 ( k ), c 9 ( k ), and c 10 ( k ).
- the preceding equation may be used directly to compute c 1 p (m), or c 1 ( k ) may be pre-computed and appropriate segments extracted to generate c 1 p (m).
- the MDCT transform of c 1 p ( m ) includes a set of MDCT coefficient values (e.g., 256 real numbers).
- the MDCT coefficient values associated with the 512-sample frequency indices 9, 10, and 11 may have significant magnitudes because c 1 p (m) is associated with the 12,288-sample central frequency index 240, which corresponds to the 512-sample central frequency index 10.
- the MDCT coefficient values associated with other 512-sample frequency indices will be negligible relative to the MDCT coefficient values associated with the 512-sample frequency indices 9, 10, and 11 for the case of c 1 p (m).
- the code frequency index 237 (e.g., the frequency value corresponding to the frequency index f 1 associated with the code symbol "0") causes the 512-sample central frequency index 10 to have the highest MDCT magnitude relative to the 512-sample frequency indices 9 and 11 because the 512-sample central frequency index 10 corresponds to the 12,288-sample central frequency index 240 and the code frequency index 237 is proximate to the 12,288-sample central frequency index 240.
- the second frequency index f 2 corresponding to the code frequency index 289 may produce MDCT coefficients with significant MDCT magnitudes in the 512-sample frequency indices 11, 12, and 13.
- the code frequency index 289 may cause the 512-sample central frequency index 12 to have the highest MDCT magnitude because the 512-sample central frequency index 12 corresponds to the 12,288-sample central frequency index 288 and the code frequency index 289 is proximate to the 12,288-sample central frequency index 288.
- the third frequency index f 3 corresponding to the code frequency index 339 may produce MDCT coefficients with significant MDCT magnitudes in the 512-sample frequency indices 13, 14, and 15.
- the code frequency index 339 may cause the 512-sample central frequency index 14 to have the highest MDCT magnitude because the 512-sample central frequency index 14 corresponds to the 12,288-sample central frequency index 336 and the code frequency index 339 is proximate to the 12,288-sample central frequency index 336.
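The relationship between the 12,288-sample code frequency indices and the 512-sample indices they influence can be sketched as follows. The index lists for the symbols "0" and "1" are taken from the text; the function names and the rounding rule are assumptions chosen to match the examples given for indices 237, 289, and 339.

```python
RATIO = 12_288 // 512    # 24 high-resolution indices per 512-sample index

# Code frequency indices f1..f10 for the symbols "0" and "1", as listed above.
SYMBOL_INDICES = {
    "0": [237, 289, 339, 383, 429, 481, 531, 575, 621, 673],
    "1": [239, 291, 337, 381, 431, 483, 529, 573, 623, 675],
}

def central_bin_512(code_index):
    """512-sample central frequency index nearest to a 12,288-sample index."""
    return round(code_index / RATIO)

def affected_bins_512(code_index):
    """Central index plus its two neighbors, where significant MDCT
    magnitudes are expected for this component."""
    center = central_bin_512(code_index)
    return [center - 1, center, center + 1]

for symbol, indices in SYMBOL_INDICES.items():
    print(symbol, [central_bin_512(i) for i in indices])
```

Both symbols map to the same central indices 10 through 28, so the symbols differ only in the fine offsets of their components around each central frequency.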
- the MDCT coefficients representing the actual watermarked code signal will correspond to the 512-sample frequency indices ranging from 9 to 29.
- Some of the 512-sample frequency indices such as, for example, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, and 29 may be influenced by energy spill-over from two neighboring code frequency indices, with the amount of spill-over a function of the weighting applied to each sinusoidal component based on the masking energy analysis. Accordingly, in each 512-sample audio block of the compressed digital data stream, the MDCT coefficients may be computed as described below to represent the code signal.
- each AC-3 frame includes six MDCT coefficient sets (e.g., MA0, MA1, MA2, MA3, MA4, and MA5 of FIG. 5 ) with each MDCT coefficient set corresponding to a 512-sample audio block.
- the mantissa M k is a product of a mantissa step size s k and an integer value N k .
- the mantissa step size s k is 2/15 and the integer value N k is -2 when the original mantissa value is -0.2666 (i.e., -4/15).
- the value of the code MDCT magnitude C 11 is used to normalize and modify the values of the MDCT coefficients m 9 , m 10 , and m 11 (as well as the other MDCT coefficients in the set m 9 through m 29 ) because the code MDCT magnitude C 11 has the lowest absolute magnitude.
- the mantissa integer values N 9 and N 10 corresponding to the original MDCT coefficients m 9 and m 10 are modified relative to N 11 as follows: $N_9 \rightarrow N_9 + \left(-1.5 \cdot \frac{s_{11}}{s_9}\right)$ and $N_{10} \rightarrow N_{10} + 4.0 \cdot \frac{s_{11}}{s_{10}}$.
- the modified mantissa integer values N 9 , N 10 , and N 11 may be used to modify the corresponding original MDCT coefficients to embed the watermark code.
- the maximum change is limited by the upper and lower limits of its mantissa integer value N k . Referring to FIG. 6 , for example, the table 600 indicates lower limit and upper limit values of -0.9333 to +0.9333.
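The integer update and its limits can be sketched as below. The sketch assumes the mantissa is M k = s k * N k, that the adjusted integer is rounded to the nearest level (the rounding rule is an assumption), and that an L-level quantizer allows integers from -(L-1)/2 to +(L-1)/2. The function name and the step sizes in the examples are illustrative.

```python
def updated_mantissa_integer(n_k, normalized_code, s_ref, s_k, levels):
    """Apply N_k -> N_k + normalized_code * s_ref / s_k, then clamp to the
    range allowed by a `levels`-level quantizer (e.g., -7..+7 for 15 levels).
    `normalized_code` is the code MDCT magnitude divided by the reference
    (smallest) code magnitude; rounding to an integer is an assumed rule."""
    limit = (levels - 1) // 2
    target = n_k + normalized_code * s_ref / s_k
    return max(-limit, min(limit, round(target)))

s_15 = 2 / 15   # step size of the fifteen-level table
s_7 = 2 / 7     # step size of a seven-level table (illustrative)

# Normalized code value 4.0 with equal step sizes pushes N from 5 past the limit.
n10_new = updated_mantissa_integer(5, 4.0, s_15, s_15, levels=15)
# Normalized code value -1.5 scaled by the step-size ratio 7/15.
n9_new = updated_mantissa_integer(-2, -1.5, s_15, s_7, levels=7)
print(n10_new, n9_new)
```

In the first example the requested value 9 exceeds the fifteen-level limit and is clamped to 7, which corresponds to the +0.9333 upper limit of table 600.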
- the preceding example illustrates how the local masking energy may be used to determine the code magnitude for code symbols to be embedded into a compressed audio signal digital data stream.
- eight successive frames of the compressed digital data stream were modified without performing decompression of MDCT coefficients during the encoding process of the methods and apparatus described herein.
- FIG. 14 is a block diagram of an example processor system 2000 that may be used to implement the methods and apparatus disclosed herein.
- the processor system 2000 may be a desktop computer, a laptop computer, a notebook computer, a personal digital assistant (PDA), a server, an Internet appliance or any other type of computing device.
- the processor system 2000 illustrated in FIG. 14 includes a chipset 2010, which includes a memory controller 2012 and an input/output (I/O) controller 2014.
- a chipset typically provides memory and I/O management functions, as well as a plurality of general purpose and/or special purpose registers, timers, etc. that are accessible or used by a processor 2020.
- the processor 2020 is implemented using one or more processors. In the alternative, other processing technology may be used to implement the processor 2020.
- the processor 2020 includes a cache 2022, which may be implemented using a first-level unified cache (L1), a second-level unified cache (L2), a third-level unified cache (L3), and/or any other suitable structures to store data.
- the memory controller 2012 performs functions that enable the processor 2020 to access and communicate with a main memory 2030 including a volatile memory 2032 and a non-volatile memory 2034 via a bus 2040.
- the volatile memory 2032 may be implemented by Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS Dynamic Random Access Memory (RDRAM), and/or any other type of random access memory device.
- the non-volatile memory 2034 may be implemented using flash memory, Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), and/or any other desired type of memory device.
- the processor system 2000 also includes an interface circuit 2050 that is coupled to the bus 2040.
- the interface circuit 2050 may be implemented using any type of well known interface standard such as an Ethernet interface, a universal serial bus (USB), a third generation input/output (3GIO) interface, and/or any other suitable type of interface.
- One or more input devices 2060 are connected to the interface circuit 2050.
- the input device(s) 2060 permit a user to enter data and commands into the processor 2020.
- the input device(s) 2060 may be implemented by a keyboard, a mouse, a touch-sensitive display, a track pad, a track ball, an isopoint, and/or a voice recognition system.
- One or more output devices 2070 are also connected to the interface circuit 2050.
- the output device(s) 2070 may be implemented by media presentation devices (e.g., a light emitting display (LED), a liquid crystal display (LCD), a cathode ray tube (CRT) display, a printer and/or speakers).
- the interface circuit 2050, thus, typically includes, among other things, a graphics driver card.
- the processor system 2000 also includes one or more mass storage devices 2080 to store software and data.
- mass storage device(s) 2080 include floppy disks and drives, hard disk drives, compact disks and drives, and digital versatile disks (DVD) and drives.
- the interface circuit 2050 also includes a communication device such as a modem or a network interface card to facilitate exchange of data with external computers via a network.
- The communication link between the processor system 2000 and the network may be any type of network connection such as an Ethernet connection, a digital subscriber line (DSL), a telephone line, a cellular telephone system, a coaxial cable, etc.
- Access to the input device(s) 2060, the output device(s) 2070, the mass storage device(s) 2080 and/or the network is typically controlled by the I/O controller 2014 in a conventional manner.
- The I/O controller 2014 performs functions that enable the processor 2020 to communicate with the input device(s) 2060, the output device(s) 2070, the mass storage device(s) 2080 and/or the network via the bus 2040 and the interface circuit 2050.
- While the components shown in FIG. 14 are depicted as separate blocks within the processor system 2000, the functions performed by some of these blocks may be integrated within a single semiconductor circuit or may be implemented using two or more separate integrated circuits.
- For example, although the memory controller 2012 and the I/O controller 2014 are depicted as separate blocks within the chipset 2010, the memory controller 2012 and the I/O controller 2014 may be integrated within a single semiconductor circuit.
- The methods and apparatus disclosed herein are particularly well suited for use with data streams implemented in accordance with the AC-3 standard. However, the methods and apparatus disclosed herein may be applied to other digital audio coding techniques.
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Computational Linguistics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computer Networks & Wireless Communication (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
- Editing Of Facsimile Originals (AREA)
- Image Processing (AREA)
- Signal Processing For Digital Recording And Reproducing (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Description
- The present disclosure relates generally to media measurements, and more particularly, to methods and apparatus for embedding watermarks in a compressed digital data stream.
- In modern television or radio broadcast stations, compressed digital data streams are typically used to carry video and/or audio data for transmission. For example, the Advanced Television Systems Committee (ATSC) standard for digital television (DTV) broadcasts in the United States adopted Moving Picture Experts Group (MPEG) standards (e.g., MPEG-1, MPEG-2, MPEG-3, MPEG-4, etc.) for carrying video content and Digital Audio Compression standards (e.g., AC-3, which is also known as Dolby Digital®) for carrying audio content (i.e., ATSC Standard: Digital Audio Compression (AC-3), Revision A, August 2001). The AC-3 compression standard is based on a perceptual digital audio coding technique that reduces the amount of data needed to reproduce the original audio signal while minimizing perceptible distortion. In particular, the AC-3 compression standard recognizes that the human ear is unable to perceive (DVDs), digital cable, and satellite transmissions that enables the broadcast of special sound effects (e.g., surround sound).
- Existing television or radio broadcast stations employ watermarking techniques to embed watermarks within video and/or audio data streams compressed in accordance with compression standards such as the AC-3 compression standard and the MPEG Advanced Audio Coding (AAC) compression standard. Typically, watermarks are digital data that uniquely identify broadcasters and/or programs. Watermarks are typically extracted using a decoding operation at one or more reception sites (e.g., households or other media consumption sites) and, thus, may be used to assess the viewing behaviors of individual households and/or groups of households to produce ratings information.
- However, many existing watermarking techniques are designed for use with analog broadcast systems. In particular, existing watermarking techniques convert analog program data to an uncompressed digital data stream, insert watermark data in the uncompressed digital data stream, and convert the watermarked data stream to an analog format prior to transmission. In the ongoing transition towards an all-digital broadcast environment in which compressed video and audio streams are transmitted by broadcast networks to local affiliates, watermark data may need to be embedded or inserted directly in a compressed digital data stream. Existing watermarking techniques may decompress the compressed digital data stream into time-domain samples, insert the watermark data into the time-domain samples, and recompress the watermarked time-domain samples into a watermarked compressed digital data stream. Such decompression/compression may cause degradation in the quality of the media content in the compressed digital data stream. Further, existing decompression/compression techniques require additional equipment and cause delay of the audio component of a broadcast in a manner that, in some cases, may be unacceptable. Moreover, the methods employed by local broadcasting affiliates to receive compressed digital data streams from their parent networks and to insert local content through sophisticated splicing equipment prevent conversion of a compressed digital data stream to a time-domain (uncompressed) signal prior to recompression of the digital data streams.
- US 6,611,607 B1 discloses identifying a transform coefficient associated with a compressed digital data stream and modifying the transform coefficient to embed the watermark. US 6,505,223 B1 discloses obtaining a mantissa code.
- EP 1 104 969 A1
- WO 97/21293 A1, WO 03/017254 A1
- In order to provide a watermarking technique that is improved as compared with existing approaches, the present invention provides solutions according to the independent claims. Preferred embodiments are defined in the respective dependent claims.
- FIG. 1 is a block diagram representation of an example media monitoring system.
- FIG. 2 is a block diagram representation of an example watermark embedding system.
- FIG. 3 is a block diagram representation of an example uncompressed digital data stream associated with the example watermark embedding system of FIG. 2.
- FIG. 4 is a block diagram representation of an example embedding device that may be used to implement the example watermark embedding system of FIG. 2.
- FIG. 5 depicts an example compressed digital data stream associated with the example embedding device of FIG. 4.
- FIG. 6 depicts an example quantization look-up table that may be used to implement the example watermark embedding system of FIG. 2.
- FIG. 7 depicts another example uncompressed digital data stream that may be compressed and then processed using the example watermark embedding system of FIG. 2.
- FIG. 8 depicts an example compressed digital data stream associated with the example uncompressed digital data stream of FIG. 7.
- FIG. 9 depicts one manner in which the example watermark embedding system of FIG. 2 may be configured to embed watermarks.
- FIG. 10 depicts one manner in which the modification process of FIG. 9 may be implemented.
- FIG. 11 depicts one manner in which a data frame may be processed.
- FIG. 12 depicts one manner in which a watermark may be embedded in a compressed digital data stream.
- FIG. 13 depicts an example code frequency index table that may be used to implement the example watermark embedding system of FIG. 2.
- FIG. 14 is a block diagram representation of an example processor system that may be used to implement the example watermark embedding system of FIG. 2.
- In general, methods and apparatus for embedding watermarks in compressed digital data streams are disclosed herein. The methods and apparatus disclosed herein may be used to embed watermarks in compressed digital data streams without prior decompression of the compressed digital data streams. As a result, the methods and apparatus disclosed herein eliminate the need to subject compressed digital data streams to multiple decompression/compression cycles, which are typically unacceptable to, for example, affiliates of television broadcast networks because multiple decompression/compression cycles may significantly degrade the quality of media content in the compressed digital data streams.
- Prior to broadcast, for example, the methods and apparatus disclosed herein may be used to unpack the modified discrete cosine transform (MDCT) coefficient sets associated with a compressed digital data stream formatted according to a digital audio compression standard such as the AC-3 compression standard. The mantissas of the unpacked MDCT coefficient sets may be modified to embed watermarks that imperceptibly augment the compressed digital data stream. Upon receipt of the compressed digital data stream, a receiving device (e.g., a set top television metering device at a media consumption site) may extract the embedded watermark information from an uncompressed analog output such as, for example, output emanating from speakers of a television set. The extracted watermark information may be used to identify the media sources and/or programs (e.g., broadcast stations) associated with media currently being consumed (e.g., viewed, listened to, etc.) at a media consumption site. In turn, the source and program identification information may be used in known manners to generate ratings information and/or any other information that may be used to assess the viewing behaviors associated with individual households and/or groups of households.
- Referring to FIG. 1, an example broadcast system 100 including a service provider 110, a television 120, a remote control device 125, and a receiving device 130 is metered using an audience measurement system. The components of the broadcast system 100 may be coupled in any well-known manner. For example, the television 120 is positioned in a viewing area 150 located within a household occupied by one or more people, referred to as household members 160, some or all of whom have agreed to participate in an audience measurement research study. The receiving device 130 may be a set top box (STB), a video cassette recorder, a digital video recorder, a personal video recorder, a personal computer, a digital video disc player, etc. coupled to the television 120. The viewing area 150 includes the area in which the television 120 is located and from which the television 120 may be viewed by the one or more household members 160 located in the viewing area 150.
- In the illustrated example, a metering device 140 is configured to identify viewing information based on video/audio output signals conveyed from the receiving device 130 to the television 120. The metering device 140 provides this viewing information as well as other tuning and/or demographic data via a network 170 to a data collection facility 180. The network 170 may be implemented using any desired combination of hardwired and wireless communication links, including for example, the Internet, an Ethernet connection, a digital subscriber line (DSL), a telephone line, a cellular telephone system, a coaxial cable, etc. The data collection facility 180 may be configured to process and/or store data received from the metering device 140 to produce ratings information.
- The service provider 110 may be implemented by any service provider such as, for example, a cable television service provider 112, a radio frequency (RF) television service provider 114, and/or a satellite television service provider 116. The television 120 receives a plurality of television signals transmitted via a plurality of channels by the service provider 110 and may be adapted to process and display television signals provided in any format such as a National Television Standards Committee (NTSC) television signal format, a high definition television (HDTV) signal format, an Advanced Television Systems Committee (ATSC) television signal format, a phase alternation line (PAL) television signal format, a digital video broadcasting (DVB) television signal format, an Association of Radio Industries and Businesses (ARIB) television signal format, etc.
- The user-operated remote control device 125 allows a user (e.g., the household member 160) to cause the television 120 to tune to and receive signals transmitted on a desired channel, and to cause the television 120 to process and present or deliver the programming or media content contained in the signals transmitted on the desired channel. The processing performed by the television 120 may include, for example, extracting a video and/or an audio component delivered via the received signal, causing the video component to be displayed on a screen/display associated with the television 120, and causing the audio component to be emitted by speakers associated with the television 120. The programming content contained in the television signal may include, for example, a television program, a movie, an advertisement, a video game, a web page, a still image, and/or a preview of other programming content that is currently offered or will be offered in the future by the service provider 110.
- While the components shown in FIG. 1 are depicted as separate structures within the broadcast system 100, the functions performed by some of these structures may be integrated within a single unit or may be implemented using two or more separate components. For example, although the television 120 and the receiving device 130 are depicted as separate structures, the television 120 and the receiving device 130 may be integrated into a single unit (e.g., an integrated digital television set). In another example, the television 120, the receiving device 130, and/or the metering device 140 may be integrated into a single unit.
- To assess the viewing behaviors of individual household members 160 and/or groups of households, a watermark embedding system (e.g., the watermark embedding system 200 of FIG. 2) may encode watermarks that uniquely identify broadcasters and/or programs in the broadcast signals from the service providers 110. The watermark embedding system may be implemented at the service provider 110 so that each of the plurality of media signals (e.g., television signals) transmitted by the service provider 110 includes one or more watermarks. Based on selections by the household members 160, the receiving device 130 may tune to and receive media signals transmitted on a desired channel and cause the television 120 to process and present the programming content contained in the signals transmitted on the desired channel. The metering device 140 may identify watermark information based on video/audio output signals conveyed from the receiving device 130 to the television 120. Accordingly, the metering device 140 may provide this watermark information as well as other tuning and/or demographic data to the data collection facility 180 via the network 170.
- In FIG. 2, an example watermark embedding system 200 includes an embedding device 210 and a watermark source 220. The embedding device 210 is configured to insert watermark information 230 from the watermark source 220 into a compressed digital data stream 240. The compressed digital data stream 240 may be compressed according to an audio compression standard such as the AC-3 compression standard and/or the MPEG-AAC compression standard, either of which may be used to process blocks of an audio signal using a predetermined number of digitized samples from each block. The source of the compressed digital data stream 240 (not shown) may be sampled at a rate of, for example, 48 kilohertz (kHz) to form audio blocks as described below.
- Typically, audio compression techniques such as those based on the AC-3 compression standard use overlapped audio blocks and the MDCT algorithm to convert an audio signal into a compressed digital data stream (e.g., the compressed digital data stream 240 of FIG. 2). Two different block sizes (i.e., short and long blocks) may be used depending on the dynamic characteristics of the sampled audio signal. For example, AC-3 short blocks may be used to minimize pre-echo for transient segments of the audio signal and AC-3 long blocks may be used to achieve high compression gain for non-transient segments of the audio signal. In accordance with the AC-3 compression standard, an AC-3 long block corresponds to a block of 512 time-domain audio samples, whereas an AC-3 short block corresponds to 256 time-domain audio samples. Based on the overlapping structure of the MDCT algorithm used in the AC-3 compression standard, in the case of the AC-3 long block, the 512 time-domain samples are obtained by concatenating a preceding (old) block of 256 time-domain samples and a current (new) block of 256 time-domain samples to create an audio block of 512 time-domain samples. The AC-3 long block is then transformed using the MDCT algorithm to generate 256 transform coefficients. In accordance with the same standard, an AC-3 short block is similarly obtained from a pair of consecutive time-domain sample blocks of audio. The AC-3 short block is then transformed using the MDCT algorithm to generate 128 transform coefficients. The 128 transform coefficients corresponding to two adjacent short blocks are then interleaved to generate a set of 256 transform coefficients. Thus, processing of either AC-3 long or AC-3 short blocks results in the same number of MDCT coefficients. In accordance with the MPEG-AAC compression standard as another example, a short block contains 128 samples and a long block contains 1024 samples.
- In the example of FIG. 3, an uncompressed digital data stream 300 includes a plurality of 256-sample time-domain audio blocks 310, generally shown as A0, A1, A2, A3, A4, and A5. The MDCT algorithm processes the audio blocks 310 to generate MDCT coefficient sets 320, shown by way of example as MA0, MA1, MA2, MA3, MA4, and MA5 (where MA5 is not shown). For example, the MDCT algorithm may process the audio blocks A0 and A1 to generate the MDCT coefficient set MA0. The audio blocks A0 and A1 are concatenated to generate a 512-sample audio block (e.g., an AC-3 long block) that is MDCT transformed using the MDCT algorithm to generate the MDCT coefficient set MA0 which includes 256 MDCT coefficients. Similarly, the audio blocks A1 and A2 may be processed to generate the MDCT coefficient set MA1. Thus, the audio block A1 is an overlapping audio block because it is used to generate both MDCT coefficient sets MA0 and MA1. In a similar manner, the MDCT algorithm is used to transform the audio blocks A2 and A3 to generate the MDCT coefficient set MA2, the audio blocks A3 and A4 to generate the MDCT coefficient set MA3, the audio blocks A4 and A5 to generate the MDCT coefficient set MA4, etc. Thus, the audio block A2 is an overlapping audio block used to generate the MDCT coefficient sets MA1 and MA2, the audio block A3 is an overlapping audio block used to generate the MDCT coefficient sets MA2 and MA3, the audio block A4 is an overlapping audio block used to generate the MDCT coefficient sets MA3 and MA4, etc. Together, the MDCT coefficient sets 320 form the compressed digital data stream 240.
- As described in detail below, the embedding device 210 of FIG. 2 may embed or insert the watermark information or watermark 230 from the watermark source 220 into the compressed digital data stream 240. The watermark 230 may be used, for example, to uniquely identify broadcasters and/or programs so that media consumption information (e.g., viewing information) and/or ratings information may be produced. Accordingly, the embedding device 210 produces a watermarked compressed digital data stream 250 for transmission.
- In the example of
FIG. 4, the embedding device 210 includes an identifying unit 410, an unpacking unit 420, a modification unit 430, and a repacking unit 440. While the operation of the embedding device 210 is described below in accordance with the AC-3 compression standard, the embedding device 210 may be implemented to operate with additional or other compression standards such as, for example, the MPEG-AAC compression standard. The operation of the embedding device 210 is described in greater detail in connection with FIG. 5.
- To begin, the identifying unit 410 is configured to identify one or more frames 510 associated with the compressed digital data stream 240, a portion of which is shown by way of example as Frame A and Frame B in FIG. 5. As mentioned previously, the compressed digital data stream 240 may be a digital data stream compressed in accordance with the AC-3 standard (hereinafter "AC-3 data stream"). While the AC-3 data stream 240 may include multiple channels, for purposes of clarity, the following example describes the AC-3 data stream 240 as including only one channel. In the AC-3 data stream 240, each of the frames 510 includes a plurality of MDCT coefficient sets 520. In accordance with the AC-3 compression standard, for example, each of the frames 510 includes six MDCT coefficient sets (i.e., six "audblk"). For example, Frame A includes the MDCT coefficient sets MA0, MA1, MA2, MA3, MA4 and MA5 and Frame B includes the MDCT coefficient sets MB0, MB1, MB2, MB3, MB4 and MB5.
- The identifying unit 410 is also configured to identify header information associated with each of the frames 510, such as, for example, the number of channels associated with the AC-3 data stream 240. While the example AC-3 data stream 240 includes only one channel as noted above, an example compressed digital data stream having multiple channels is described below in connection with FIGS. 7 and 8.
- Returning to FIG. 5, the unpacking unit 420 is configured to unpack the MDCT coefficient sets 520 to determine compression information such as, for example, the parameters of the original compression process (i.e., the manner in which an audio compression technique compressed an audio signal or audio data to form the compressed digital data stream 240). For example, the unpacking unit 420 may determine how many bits are used to represent each of the MDCT coefficients within the MDCT coefficient sets 520. Additionally, compression parameters may include information that limits the extent to which the AC-3 data stream 240 may be modified to ensure that the media content conveyed via the AC-3 data stream 240 is of a sufficiently high quality level. The embedding device 210 subsequently uses the compression information identified by the unpacking unit 420 to embed/insert the desired watermark information 230 into the AC-3 data stream 240, thereby ensuring that the watermark insertion is performed in a manner consistent with the compression information supplied in the signal.
- As described in detail in the AC-3 compression standard, the compression information also includes a mantissa and an exponent associated with each MDCT coefficient. The AC-3 compression standard employs techniques to reduce the number of bits used to represent each MDCT coefficient. Psycho-acoustic masking is one factor that may be utilized by these techniques. For example, the presence of audio energy Ek either at a particular frequency k (e.g., a tone) or spread across a band of frequencies proximate to the particular frequency k (e.g., a noise-like characteristic) creates a masking effect. That is, the human ear is unable to perceive a change in energy in a spectral region either at a frequency k or spread across the band of frequencies proximate to the frequency k if that change is less than a given energy threshold ΔEk. Because of this characteristic of the human ear, an MDCT coefficient mk associated with the frequency k may be quantized with a step size related to ΔEk without risk of causing any humanly perceptible changes to the audio content. For the AC-3 data stream 240, each MDCT coefficient mk is represented as a mantissa Mk and an exponent Xk such that $m_k = M_k \cdot 2^{-X_k}$. The number of bits used to represent the mantissa Mk of each MDCT coefficient of the MDCT coefficient sets 520 may be determined based on known quantization look-up tables published in the AC-3 compression standard (e.g., the quantization look-up table 600 of FIG. 6). In the example of FIG. 6, the quantization look-up table 600 provides mantissa codes or bit patterns and corresponding mantissa values for MDCT coefficients represented by a four-bit number. As described in detail below, the mantissa Mk may be changed (e.g., augmented) to represent a modified value of an MDCT coefficient to embed a watermark in the AC-3 data stream 240.
- Returning to FIG. 5, the modification unit 430 is configured to perform an inverse transform of each of the MDCT coefficient sets 520 to generate time-domain audio blocks 530, shown by way of example as TA0', TA3", TA4', TA4", TA5', TA5", TB0', TB0", TB1', TB1", and TB5' (TA0" through TA3' and TB2' through TB4" are not shown). The modification unit 430 performs inverse transform operations to generate sets of previous (old) time-domain audio blocks (which are represented as prime blocks) and sets of current (new) time-domain audio blocks (which are represented as double-prime blocks) associated with the 256-sample time-domain audio blocks that were concatenated to form the MDCT coefficient sets 520 of the AC-3 data stream 240. For example, the modification unit 430 performs an inverse transform on the MDCT coefficient set MA5 to generate time-domain blocks TA4" and TA5', the MDCT coefficient set MB0 to generate TA5" and TB0', the MDCT coefficient set MB1 to generate TB0" and TB1', etc. In this manner, the modification unit 430 generates reconstructed time-domain audio blocks 540, which provide a reconstruction of the original time-domain audio blocks that were compressed to form the AC-3 data stream 240. To generate the reconstructed time-domain audio blocks 540, the modification unit 430 may add time-domain audio blocks based on, for example, the known Princen-Bradley time domain alias cancellation (TDAC) technique as described in Princen et al., Analysis/Synthesis Filter Bank Design Based on Time Domain Aliasing Cancellation, Institute of Electrical and Electronics Engineers (IEEE) Transactions on Acoustics, Speech and Signal Processing, Vol. ASSP-35, No. 5, pp. 1153-1161 (1996). For example, the modification unit 430 may reconstruct the time-domain audio block TA5 (i.e., TA5R) by adding the prime time-domain audio block TA5' and the double-prime time-domain audio block TA5" using the Princen-Bradley TDAC technique.
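The overlap-add reconstruction described above can be illustrated with a minimal sketch. It is not the AC-3 implementation: it assumes a sine window (chosen only because it satisfies the Princen-Bradley condition, whereas the AC-3 standard specifies its own window), a toy block size of 8 samples instead of 256, and direct, unoptimized transforms. The function names `sine_window`, `mdct`, `imdct`, and `reconstruct_middle_block` are hypothetical and do not come from the disclosure.

```python
import math


def sine_window(n_coeffs):
    """Sine window of length 2N (an assumption); w[n]^2 + w[n+N]^2 == 1."""
    size = 2 * n_coeffs
    return [math.sin(math.pi * (n + 0.5) / size) for n in range(size)]


def _basis(n, k, n_coeffs):
    # Cosine kernel shared by the forward and inverse transforms.
    return math.cos(math.pi / n_coeffs * (n + 0.5 + n_coeffs / 2) * (k + 0.5))


def mdct(block, window):
    """Window a 2N-sample block and transform it into N coefficients."""
    n_coeffs = len(block) // 2
    return [
        sum(window[n] * block[n] * _basis(n, k, n_coeffs)
            for n in range(2 * n_coeffs))
        for k in range(n_coeffs)
    ]


def imdct(coeffs, window):
    """Inverse-transform N coefficients into 2N windowed, aliased samples."""
    n_coeffs = len(coeffs)
    scale = 2.0 / n_coeffs
    return [
        window[n] * scale * sum(coeffs[k] * _basis(n, k, n_coeffs)
                                for k in range(n_coeffs))
        for n in range(2 * n_coeffs)
    ]


def reconstruct_middle_block(a0, a1, a2):
    """Recover a1 from the coefficient sets of (a0, a1) and (a1, a2)."""
    n_coeffs = len(a1)
    window = sine_window(n_coeffs)
    first_set = mdct(a0 + a1, window)    # a1 is the second half of this block
    second_set = mdct(a1 + a2, window)   # a1 is the first half of this block
    # Second half of the earlier inverse transform (a prime block in FIG. 5).
    prime = imdct(first_set, window)[n_coeffs:]
    # First half of the later inverse transform (a double-prime block).
    double_prime = imdct(second_set, window)[:n_coeffs]
    # Adding the two aliased halves cancels the time-domain aliasing.
    return [p + q for p, q in zip(prime, double_prime)]


a0 = [0.1 * i for i in range(8)]
a1 = [math.sin(0.7 * i) for i in range(8)]
a2 = [1.0 - 0.2 * i for i in range(8)]
a1_reconstructed = reconstruct_middle_block(a0, a1, a2)
# Largest reconstruction error; expected to be floating-point rounding only.
print(max(abs(r - x) for r, x in zip(a1_reconstructed, a1)))
```

Neither half on its own equals the original block; each contains a time-reversed alias term, and only their sum restores the samples. This is why two consecutive coefficient sets are needed to reconstruct a single 256-sample block.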
Likewise, themodification unit 430 may reconstruct the time-domain audio block TB0 (i.e., TB0R) by adding the prime audio block TB0' and the double-prime audio block TB0" using the Princen-Bradley TDAC technique. In this manner, the original time-domain audio blocks used to form the AC-3data stream 240 are reconstructed to enable thewatermark 230 to be embedded or inserted directly into the AC-3data stream 240. - The
modification unit 430 is also configured to insert thewatermark 230 into the reconstructed time-domain audio blocks 540 to generate watermarked time-domain audio blocks 550, shown by way of example as TA0W, TA4W, TA5W, TB0W, TB1W and TB5W (blocks TA1W, TA2W, TA3W, TB2W, TB3W and TB4W are not shown). To insert thewatermark 230, themodification unit 430 generates a modifiable time-domain audio block by concatenating two adjacent reconstructed time-domain audio blocks to create a 512-sample audio block. For example, themodification unit 430 may concatenate the reconstructed time-domain audio blocks TA5R and TB0R (each being a 256-sample audio block) to form a 512-sample audio block. Themodification unit 430 may then insert thewatermark 230 into the 512-sample audio block formed by the reconstructed time-domain audio blocks TA5R and TB0R to generate the watermarked time-domain audio blocks TA5W and TB0W. Encoding processes such as those described inU.S. Patent Nos. 6,272,176 ,6,504,870 , and6,621,881 may be used to insert thewatermark 230 into the reconstructed time-domain audio blocks 540. The disclosures ofU.S. Patent Nos. 6,272,176 ,6,504,870 , and6,621,881 are hereby incorporated by reference herein in their entireties. - In the example encoding methods and apparatus described in
U.S. Patent Nos. 6,272,176 ,6,504,870 , and6,621,881 , watermarks may be inserted into a 512-sample audio block. For example, each 512-sample audio block carries one bit of embedded or inserted data of thewatermark 230. In particular, spectral frequency components with indices f1 and f2 may be modified or augmented to insert data bits associated with thewatermark 230. To insert a binary ∼1," for example, a power at the first spectral frequency associated with the index f1 may be increased or augmented to be a spectral power maximum within a frequency neighborhood (e.g., a frequency neighborhood defined by the indices f1 - 2, f1 N 1, f1, f1 + 1, and f1 + 2). At the same time, the power at the second spectral frequency associated with the index f2 is attenuated or augmented to be a spectral power minimum within a frequency neighborhood (e.g., a frequency neighborhood defined by the indices f2 N 2, f2 N1, f2, f2 + 1, and f2 + 2). Conversely, to insert a binary ∼0," the power at the first spectral frequency associated with the index f1 is attenuated to be a local spectral power minimum while the power at the second spectral frequency associated with the index f2 is increased to a local spectral power maximum. - Returning to
FIG. 5 , based on the watermarked time-domain audio blocks 550, themodification unit 430 generates watermarked MDCT coefficient sets 560, shown by way of example as MA0W, MA4W, MA5W, MB0W and MB5W (blocks MA1W, MA2W, MA3W, MB1W, MB2W, MB3W and MB4W are not shown). Following the example described above, themodification unit 430 generates the watermarked MDCT coefficient set MA5W based on the watermarked time-domain audio blocks TA5W and TB0W. Specifically, themodification unit 430 concatenates the watermarked time-domain audio blocks TA5W and TB0W to form a 512-sample audio block and converts the 512-sample audio block into the watermarked MDCT coefficient set MA5W which, as described in greater detail below, may be used to modify the original MDCT coefficient set MA5. - The difference between the MDCT coefficient sets 520 and the watermarked MDCT coefficient sets 560 represents a change in the AC-3
data stream 240 as a result of embedding or inserting thewatermark 230. As described in connection withFIG. 6 , for example, themodification unit 430 may modify the mantissa values in the MDCT coefficient set MA5 based on the differences between the coefficients in the corresponding watermarked MDCT coefficient set MA5W and the coefficients in the original MDCT coefficient set MA5. Quantization look-up tables (e.g., the look-up table 600 ofFIG. 6 ) may be used to determine new mantissa values associated with the MDCT coefficients of the watermarked MDCT coefficient sets 560 to replace the old mantissa values associated with the MDCT coefficients of the MDCT coefficient sets 520. Thus, the new mantissa values represent the change in or augmentation of the AC-3data stream 240 as a result of embedding or inserting thewatermark 230. It is important to note that, in this example implementation, the exponents of the MDCT coefficients are not changed. Changing the exponents might require that the underlying compressed signal representation be recomputed, thereby requiring the compressed signal to undergo a true decompression/compression cycle. If a modification of only the mantissa is insufficient to fully account for the difference between a watermarked and an original MDCT coefficient, the affected MDCT mantissa is set to a maximum or minimum value, as appropriate. The redundancy included in the watermarking process allows the correct watermark to be decoded in the presence of such an encoding restriction. - Turning to
FIG. 6, the example quantization look-up table 600 includes mantissa codes and mantissa values for a fifteen-level quantization of an example mantissa Mk in the range of -0.9333 to +0.9333. While the example quantization look-up table 600 provides mantissa information associated with MDCT coefficients that are represented using four bits, the AC-3 compression standard provides quantization look-up tables associated with other suitable numbers of bits per MDCT coefficient. To illustrate one manner in which the modification unit 430 may modify a particular MDCT coefficient mk with a mantissa Mk contained in the MDCT coefficient set MA5, assume the original mantissa value is -0.2666 (i.e., -4/15). Using the quantization look-up table 600, the mantissa code corresponding to the particular MDCT coefficient mk in the MDCT coefficient set MA5 is determined to be 0101. The watermarked MDCT coefficient set MA5W includes a watermarked MDCT coefficient wmk with a mantissa value WMk. Further, assume the new mantissa value of the corresponding watermarked MDCT coefficient wmk of the watermarked MDCT coefficient set MA5W is -0.4300, which lies between the mantissa values of the mantissa codes 0011 and 0100. In other words, the watermark 230, in this example, results in a difference of approximately -0.163 between the original mantissa value of -0.2666 and the watermarked mantissa value of -0.4300. - To embed or insert the
watermark 230 in the AC-3 data stream 240, the modification unit 430 may use the watermarked MDCT coefficient set MA5W to modify or augment the MDCT coefficients in the MDCT coefficient set MA5. Continuing with the above example, either mantissa code 0011 or mantissa code 0100 may replace the mantissa code 0101 associated with the MDCT coefficient mk because the watermarked mantissa WMk associated with the corresponding watermarked MDCT coefficient wmk lies between the mantissa values of the mantissa codes 0011 and 0100 (because the mantissa value corresponding to the watermarked MDCT coefficient wmk is -0.4300). The mantissa value corresponding to the mantissa code 0011 is -0.5333 (i.e., -8/15) and the mantissa value corresponding to the mantissa code 0100 is -0.4 (i.e., -6/15). In this example, the modification unit 430 selects the mantissa code 0100 instead of the mantissa code 0011 to replace the original mantissa code 0101 associated with the MDCT coefficient mk because the mantissa value -0.4 corresponding to the mantissa code 0100 is closest to the desired watermark mantissa value -0.4300. As a result, the new mantissa bit pattern of 0100, which corresponds to the watermarked mantissa WMk of the watermarked MDCT coefficient wmk, replaces the original mantissa bit pattern of 0101. Likewise, each of the MDCT coefficients in the MDCT coefficient set MA5 may be modified in the manner described above. If a watermarked mantissa value is outside the quantization range of mantissa values (i.e., greater than 0.9333 or less than -0.9333), either the positive limit of 1110 or the negative limit of 0000 is selected as the new mantissa code, as appropriate. Additionally, and as discussed above, while the mantissa codes associated with each MDCT coefficient of an MDCT coefficient set may be modified as described above, the exponents associated with the MDCT coefficients remain unchanged. - The repacking
unit 440 is configured to repack the watermarked MDCT coefficient sets 560 associated with each frame of the AC-3 data stream 240 for transmission. In particular, the repacking unit 440 identifies the position of each MDCT coefficient set within a frame of the AC-3 data stream 240 so that the corresponding watermarked MDCT coefficient set can be used to modify the MDCT coefficient set. To rebuild a watermarked version of Frame A, for example, the repacking unit 440 may identify the position of and modify the MDCT coefficient sets MA0 to MA5 based on the corresponding watermarked MDCT coefficient sets MA0W to MA5W in the corresponding identified positions. Using the unpacking, modifying, and repacking processes described herein, the AC-3 data stream 240 remains a compressed digital data stream while the watermark 230 is embedded or inserted in the AC-3 data stream 240. As a result, the embedding device 210 inserts the watermark 230 into the AC-3 data stream 240 without additional decompression/compression cycles that may degrade the quality of the media content in the AC-3 data stream 240. - For simplicity, the AC-3
data stream 240 is described in connection with FIG. 5 to include a single channel. However, the methods and apparatus disclosed herein may be applied to compressed digital data streams having audio blocks associated with multiple channels, such as 5.1 channels (i.e., five full-bandwidth channels and one low-frequency effects channel), as described below. In the example of FIG. 7, an uncompressed digital data stream 700 may include a plurality of audio block sets 710. Each of the audio block sets 710 may include audio blocks associated with multiple channels. - Each of the audio blocks associated with a particular channel in the audio block sets 710 may be processed in a manner similar to that described above in connection with
FIGS. 5 and 6. For example, the audio blocks associated with the center channel 810 of FIG. 8, shown by way of example as A0C, A1C, A2C, and A3C, may be transformed to generate the MDCT coefficient sets 820 associated with a compressed digital data stream 800. As noted above, each of the MDCT coefficient sets 820 may be derived from a 512-sample audio block formed by concatenating a preceding (old) 256-sample audio block and a current (new) 256-sample audio block. The MDCT algorithm may then process the time-domain audio blocks 810 (e.g., A0C through A5C) to generate the MDCT coefficient sets (e.g., M0C through M5C). - Based on the MDCT coefficient sets 820 of the compressed
digital data stream 800, the identifying unit 410 identifies a plurality of frames (not shown) and header information associated with each of the frames as described above. The header information includes compression information associated with the compressed digital data stream 800. For each of the frames, the unpacking unit 420 unpacks the MDCT coefficient sets 820 to determine the compression information associated with the MDCT coefficient sets 820. For example, the unpacking unit 420 may identify the number of bits used by the original compression process to represent the mantissa of each MDCT coefficient in each of the MDCT coefficient sets 820. Such compression information may be used to embed the watermark 230 as described above in connection with FIG. 6. The modification unit 430 then generates inverse transformed time-domain audio blocks 830, shown by way of example as TA0C", TA1C', TA1C", TA2C', TA2C", and TA3C'. The time-domain audio blocks 830 include a set of previous (old) time-domain audio blocks (which are represented as prime blocks) and a set of current (new) time-domain audio blocks (which are represented as double-prime blocks). By adding the corresponding prime blocks and double-prime blocks based on, for example, the Princen-Bradley TDAC technique, original time-domain audio blocks compressed to form the AC-3 digital data stream 800 may be reconstructed (i.e., the reconstructed time-domain audio blocks 840). For example, the modification unit 430 may add the time-domain audio blocks TA1C' and TA1C" to reconstruct the time-domain audio block TA1C (i.e., TA1CR). Likewise, the modification unit 430 may add the time-domain audio blocks TA2C' and TA2C" to reconstruct the time-domain audio block TA2C (i.e., TA2CR). - To insert the
watermark 230 from the watermark source 220, the modification unit 430 concatenates two adjacent reconstructed time-domain audio blocks to create a 512-sample audio block (i.e., a modifiable time-domain audio block). For example, the modification unit 430 may concatenate the reconstructed time-domain audio blocks TA1CR and TA2CR, each of which is a 256-sample short block, to form a 512-sample audio block. The modification unit 430 then inserts the watermark 230 into the 512-sample audio block formed by the reconstructed time-domain audio blocks TA1CR and TA2CR to generate the watermarked time-domain audio blocks TA1CW and TA2CW. - Based on the watermarked time-domain audio blocks 850, the
modification unit 430 may generate the watermarked MDCT coefficient sets 860. For example, the modification unit 430 may concatenate the watermarked time-domain audio blocks TA1CW and TA2CW to generate the watermarked MDCT coefficient set M1CW. The modification unit 430 modifies the MDCT coefficient sets 820 based on a corresponding one of the watermarked MDCT coefficient sets 860. For example, the modification unit 430 may use the watermarked MDCT coefficient set M1CW to modify the original MDCT coefficient set M1C. The modification unit 430 may then repeat the process described above for the audio blocks associated with each channel to insert the watermark 230 into the compressed digital data stream 800. -
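As a rough illustration of the mantissa-only modification described above in connection with FIG. 6, the following Python sketch picks the mantissa code whose value is nearest to a watermarked mantissa value and falls back to the limit codes for out-of-range values. It assumes the fifteen-level table follows value = (2 × code − 14) / 15, which is inferred from the values quoted from table 600 (0000 for -0.9333, 0101 for -0.2666, 1110 for +0.9333); the function names are hypothetical and are not taken from the AC-3 standard.

```python
LEVELS = 15  # fifteen-level quantization, codes 0000 through 1110


def mantissa_value(code: int) -> float:
    """Mantissa value for a 4-bit code, assuming value = (2*code - 14) / 15."""
    return (2 * code - (LEVELS - 1)) / LEVELS


def nearest_mantissa_code(watermarked_value: float) -> int:
    """Return the code whose mantissa value is closest to the watermarked value.

    Values beyond +/-0.9333 end up at the limit codes 1110 or 0000, because
    only the mantissa (never the exponent) is modified.
    """
    return min(range(LEVELS),
               key=lambda code: abs(mantissa_value(code) - watermarked_value))


original_code = 0b0101                       # original mantissa of -4/15
new_code = nearest_mantissa_code(-0.4300)    # watermarked mantissa WMk
print(format(original_code, "04b"), "->", format(new_code, "04b"))  # 0101 -> 0100
```

Searching all fifteen codes keeps the sketch short; a real embedder would more likely compute the nearest level directly from the step size.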
FIG. 9 is a flow diagram depicting one manner in which the example watermark embedding system of FIG. 2 may be configured to embed or insert watermarks in a compressed digital data stream. The example process of FIG. 9 may be implemented as machine accessible instructions utilizing any of many different programming codes stored on any combination of machine-accessible media such as a volatile or nonvolatile memory or other mass storage device (e.g., a floppy disk, a CD, and a DVD). For example, the machine accessible instructions may be embodied in a machine-accessible medium such as a programmable gate array, an application specific integrated circuit (ASIC), an erasable programmable read only memory (EPROM), a read only memory (ROM), a random access memory (RAM), magnetic media, optical media, and/or any other suitable type of medium. Further, although a particular order of actions is illustrated in FIG. 9, these actions can be performed in other temporal sequences. Again, the flow diagram 900 is merely provided and described in connection with the components of FIGS. 2 to 5 as an example of one way to configure a system to embed watermarks in a compressed digital data stream. - In the example of
FIG. 9, the process begins with the identifying unit 410 (FIG. 4) identifying a frame associated with the compressed digital data stream 240 (FIG. 2) such as Frame A (FIG. 5) (block 910). The identified frame may include a plurality of MDCT coefficient sets formed by overlapping and concatenating a plurality of audio blocks. In accordance with the AC-3 compression standard, for example, a frame may include six MDCT coefficient sets (i.e., six "audblk"). Further, the identifying unit 410 (FIG. 4) also identifies header information associated with the frame (block 920). For example, the identifying unit 410 may identify the number of channels associated with the compressed digital data stream 240. - The unpacking
unit 420 then unpacks the plurality of MDCT coefficient sets to determine compression information associated with the original compression process used to generate the compressed digital data stream 240 (block 930). In particular, the unpacking unit 420 identifies the mantissa Mk and the exponent Xk of each MDCT coefficient mk of each of the MDCT coefficient sets. The exponents of the MDCT coefficients may then be grouped in a manner compliant with the AC-3 compression standard. The unpacking unit 420 (FIG. 4) also determines the number of bits used to represent the mantissa of each of the MDCT coefficients so that a suitable quantization look-up table specified by the AC-3 compression standard may be used to modify or augment the plurality of MDCT coefficient sets as described above in connection with FIG. 6. Control then proceeds to block 940, which is described in greater detail below in connection with FIG. 10. - As illustrated in
FIG. 10, the modification process 940 begins by using the modification unit 430 (FIG. 4) to perform an inverse transform of the MDCT coefficient sets to generate inverse transformed time-domain audio blocks (block 1010). In particular, the modification unit 430 generates a previous (old) time-domain audio block (which, for example, is represented as a prime block in FIG. 5) and a current (new) time-domain audio block (which is represented as a double-prime block in FIG. 5) associated with each of the 256-sample original time-domain audio blocks used to generate the corresponding MDCT coefficient set. As described in connection with FIG. 5, for example, the modification unit 430 may generate TA4" and TA5' from the MDCT coefficient set MA5, TA5" and TB0' from the MDCT coefficient set MB0, and TB0" and TB1' from the MDCT coefficient set MB1. For each time-domain audio block, the modification unit 430 adds corresponding prime and double-prime blocks to reconstruct the time-domain audio block based on, for example, the Princen-Bradley TDAC technique (block 1020). Following the above example, the prime block TA5' and the double-prime block TA5" may be added to reconstruct the time-domain audio block TA5 (i.e., the reconstructed time-domain audio block TA5R) while the prime block TB0' and the double-prime block TB0" may be added to reconstruct the time-domain audio block TB0 (i.e., the reconstructed time-domain audio block TB0R). - To insert the
watermark 230, the modification unit 430 generates modifiable time-domain audio blocks using the reconstructed time-domain audio blocks (block 1030). The modification unit 430 generates a modifiable 512-sample time-domain audio block using two adjacent reconstructed time-domain audio blocks. For example, the modification unit 430 may generate a modifiable time-domain audio block by concatenating the reconstructed time-domain audio blocks TA5R and TB0R of FIG. 5. - Implementing an encoding process such as, for example, one or more of the encoding methods and apparatus described in
U.S. Patent Nos. 6,272,176, 6,504,870, and/or 6,621,881, the modification unit 430 inserts the watermark 230 from the watermark source 220 into the modifiable time-domain audio blocks (block 1040). For example, the modification unit 430 may insert the watermark 230 into the 512-sample time-domain audio block generated using the reconstructed time-domain audio blocks TA5R and TB0R to generate the watermarked time-domain audio blocks TA5W and TB0W. Based on the watermarked time-domain audio blocks and the compression information, the modification unit 430 generates watermarked MDCT coefficient sets (block 1050). As noted above, two watermarked time-domain audio blocks, where each block includes 256 samples, may be used to generate a watermarked MDCT coefficient set. For example, the watermarked time-domain audio blocks TA5W and TB0W may be concatenated and then used to generate the watermarked MDCT coefficient set MA5W. - Based on the compression information associated with the compressed
digital data stream 240, the modification unit 430 calculates the mantissa value associated with each of the watermarked MDCT coefficients in the watermarked MDCT coefficient set MA5W as described above in connection with FIG. 6. In this manner, the modification unit 430 can modify or augment the original MDCT coefficient sets using the watermarked MDCT coefficient sets to embed or insert the watermark 230 in the compressed digital data stream 240 (block 1060). Following the above example, the modification unit 430 may replace the original MDCT coefficient set MA5 based on the watermarked MDCT coefficient set MA5W of FIG. 5. For example, the modification unit 430 may replace an original MDCT coefficient in the MDCT coefficient set MA5 with a corresponding watermarked MDCT coefficient (which has an augmented mantissa value) from the watermarked MDCT coefficient set MA5W. Alternatively, the modification unit 430 may compute the difference between the mantissa codes associated with the original MDCT coefficient and the corresponding watermarked MDCT coefficient (i.e., ΔMk = Mk - WMk) and modify the original MDCT coefficient based on the difference ΔMk. In either case, after modifying the original MDCT coefficient sets, the modification process 940 terminates and returns control to block 950. - Referring back to
FIG. 9, the repacking unit 440 repacks the frame of the compressed digital data stream (block 950). The repacking unit 440 identifies the position of the MDCT coefficient sets within the frame so that the modified MDCT coefficient sets may be substituted in the positions of the original MDCT coefficient sets to rebuild the frame. At block 960, if the embedding device 210 determines that additional frames of the compressed digital data stream 240 need to be processed, then control returns to block 910. If, instead, all frames of the compressed digital data stream 240 have been processed, then the process 900 terminates. - As noted above, known watermarking techniques typically decompress a compressed digital data stream into uncompressed time-domain samples, insert the watermark into the time-domain samples, and recompress the watermarked time-domain samples into a watermarked compressed digital data stream. In contrast, the
digital data stream 240 remains compressed during the example unpacking, modifying, and repacking processes described herein. As a result, the watermark 230 is embedded into the compressed digital data stream 240 without additional decompression/compression cycles that may degrade the quality of the content in the compressed digital data stream 240. - To further illustrate the
example modification process 940 of FIGS. 9 and 10, FIG. 11 depicts one manner in which a data frame (e.g., an AC-3 frame) may be processed. The example frame processing process 1100 begins with the embedding device 210 reading the header information of the acquired frame (e.g., an AC-3 frame) (block 1110) and initializing an MDCT coefficient set count to zero (block 1120). In the case where an AC-3 frame is being processed, each AC-3 frame includes six MDCT coefficient sets having compressed-domain data (e.g., MA0, MA1, MA2, MA3, MA4 and MA5 of FIG. 5, which are also known as "audblks" in the AC-3 standard). Accordingly, the embedding device 210 determines whether the MDCT coefficient set count is equal to six (block 1130). If the MDCT coefficient set count is not yet equal to six, thereby indicating that at least one more MDCT coefficient set requires processing, the embedding device 210 extracts the exponent (block 1140) and the mantissa (block 1150) associated with an MDCT coefficient of the frame (e.g., the original mantissa Mk described above in connection with FIG. 6). The embedding device 210 computes a new mantissa associated with a code symbol read at block 1220 (e.g., the new mantissa WMk described above in connection with FIG. 6) (block 1160) and modifies the original mantissa associated with the frame based on the new mantissa (block 1170). For example, the original mantissa may be modified based on the difference between the new mantissa and the original mantissa (but limited within the range associated with the bit representation of the original mantissa). The embedding device 210 increments the MDCT coefficient set count by one (block 1180) and control returns to block 1130. Although the example process of FIG. 11 is described above to include six MDCT coefficient sets (e.g., the threshold of the MDCT coefficient set count is six), a process utilizing more or fewer MDCT coefficient sets could be used instead. 
At block 1130, if the MDCT coefficient set count is equal to six, then all MDCT coefficient sets have been processed such that the watermark has been embedded and the embedding device 210 repacks the frame (block 1190). - As noted above, many methods are known to embed a watermark imperceptible to the human ear (e.g., an inaudible code) in an uncompressed audio signal. For example, one known method is described in
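The counting loop of FIG. 11 can be sketched roughly as follows. This is a minimal Python sketch rather than the patented implementation: the frame is modeled as six lists of (exponent, integer mantissa level) pairs, `level_delta` is a hypothetical stand-in for the code-symbol computation of block 1160, and a fifteen-level mantissa (levels 0 through 14) is assumed for every coefficient.

```python
from typing import Callable, List, Tuple

Coefficient = Tuple[int, int]   # (exponent Xk, mantissa level 0..14); assumed layout
SETS_PER_FRAME = 6              # an AC-3 frame carries six MDCT coefficient sets
MAX_LEVEL = 14                  # assumed fifteen-level mantissa for every coefficient


def process_frame(frame: List[List[Coefficient]],
                  level_delta: Callable[[int, int], int]) -> List[List[Coefficient]]:
    """Modify mantissas set by set; exponents pass through unchanged."""
    repacked = []
    set_count = 0                                        # block 1120
    while set_count != SETS_PER_FRAME:                   # block 1130
        new_set = []
        for k, (exponent, level) in enumerate(frame[set_count]):  # blocks 1140, 1150
            wanted = level + level_delta(set_count, k)             # block 1160
            clamped = max(0, min(MAX_LEVEL, wanted))               # block 1170
            new_set.append((exponent, clamped))
        repacked.append(new_set)
        set_count += 1                                   # block 1180
    return repacked                                      # block 1190: repack the frame


frame = [[(3, 5), (4, 13)] for _ in range(SETS_PER_FRAME)]
marked = process_frame(frame, lambda set_index, k: 2 if k else -1)
print(marked[0])   # [(3, 4), (4, 14)]
```

The second coefficient shows the range limit at work: a requested level of 15 is held at 14 because the exponent cannot absorb the excess.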
U.S. Patent No. 6,421,445 to Jensen et al., the disclosure of which is hereby incorporated by reference herein in its entirety. In particular, as described by Jensen et al., a code signal (e.g., a watermark) may include information at a combination of ten different frequencies, which are detectable by a decoder using a Fourier spectral analysis of a sequence of audio samples (e.g., a sequence of 12,288 audio samples as described in detail below). For example, an audio signal may be sampled at a rate of 48 kilohertz (kHz) to output an audio sequence of 12,288 audio samples that may be processed (e.g., using a Fourier transform) to acquire a relatively high-resolution (e.g., 3.9 Hz) frequency domain representation of the uncompressed audio signal. However, in accordance with the encoding process of the method disclosed by Jensen et al., a sinusoidal code signal having constant amplitude across an entire sequence of audio samples is unacceptable because the sinusoidal code signal may be perceptible to the human ear. To satisfy the masking energy constraints (i.e., to ensure that the sinusoidal code signal information remains imperceptible), the sinusoidal code signal is synthesized across the entire sequence of 12,288 audio samples using a masking energy analysis which determines a local sinusoidal amplitude within each block of audio samples (e.g., wherein each block of audio samples may include 512 audio samples). Thus, the local sinusoidal waveforms may be coherent (in-phase) across the sequence of 12,288 audio samples but have varying amplitudes based on the masking energy analysis. - However, in contrast to the method disclosed by Jensen et al., the methods and apparatus described herein may be used to embed a watermark or other code signal in a compressed audio signal in a manner such that a compressed digital data stream containing the compressed audio signal remains compressed during the unpacking, modifying, and repacking processes.
FIG. 12 depicts one manner in which a watermark, such as that disclosed by Jensen et al., may be inserted in a compressed audio signal. The example process 1200 begins with initializing a frame count to zero (block 1210). Eight frames (e.g., AC-3 frames) representing a total of 12,288 audio samples of each audio channel may be processed to embed one or more code symbols (e.g., one or more of the symbols "0", "1", "S", and "E" shown in FIG. 13 and described in Jensen et al.) into the audio signal. Although the compressed digital data stream is described herein to include 12,288 audio samples, the compressed digital data stream may have more or fewer audio samples. The embedding device 210 (FIG. 2) may read a watermark 230 from the watermark source 220 to inject one or more code symbols into the sequence of frames (block 1220). The embedding device 210 may acquire one of the frames (block 1230) and proceed to the frame processing operation 1100 described above to process the acquired frame. Accordingly, the example frame processing operation 1100 terminates and control returns to block 1250 to increment the frame count by one. The embedding device 210 determines whether the frame count is eight (block 1260). If the frame count is not eight, the embedding device 210 returns to acquire another frame in the sequence and repeat the example frame processing operation 1100 as described above in connection with FIG. 11 to process another frame. If, instead, the frame count is eight, the embedding device 210 returns to block 1210 to reinitialize the frame count to zero and repeat the process 1200 to process another sequence of frames. - As noted above, a code signal (e.g., the watermark 230) may be embedded or injected into the compressed digital data stream (e.g., an AC-3 data stream). As shown in the example table 1300 of
FIG. 13 and described in Jensen et al., the code signal may include a combination of ten sinusoidal components corresponding to frequency indices f1 through f10 to represent one of four code symbols "0," "1," "S," and "E." For example, the code symbol "0" may represent a binary value of zero and the code symbol "1" may represent a binary value of one. Further, the code symbol "S" may represent the start of a message and the code symbol "E" may represent the end of a message. While only four code symbols are shown in FIG. 13, more or fewer code symbols could be used instead. Additionally, table 1300 lists the transform bins corresponding to the center frequencies about which the ten sinusoidal components for each symbol are located. For example, the 512-sample central frequency indices (e.g., 10, 12, 14, 16, 18, 20, 22, 24, 26, and 28) are associated with a low resolution frequency domain representation of the compressed digital data stream and the 12,288-sample central frequency indices (e.g., 240, 288, 336, 384, 432, 480, 528, 576, 624, and 672) are associated with a high resolution frequency domain representation of the compressed digital data stream. - As noted above, each code symbol may be formed using ten sinusoidal components associated with the frequency indices f1 through f10 depicted in table 1300. For example, a code signal for injecting or embedding the
code symbol "0" includes ten sinusoidal components corresponding to the frequency indices listed for that symbol in table 1300. Similarly, a code signal for the code symbol "1" includes ten sinusoidal components corresponding to the frequency indices listed for that symbol. - Each of the ten sinusoidal components associated with the frequency indices f1 through f10 may be synthesized in the time domain using the methods and apparatus described herein. For example, the code signal for injecting or embedding the
code symbol "0" may include sinusoids c1(k), c2(k), c3(k), c4(k), c5(k), c6(k), c7(k), c8(k), c9(k), and c10(k). The first sinusoid c1(k) may be synthesized in the time domain as a sequence of samples as follows: [...] symbol "0," the MDCT coefficient values associated with the 512-sample frequency indices [...] central frequency index 240, which corresponds to the 512-sample central frequency index 10. The MDCT coefficient values associated with other 512-sample frequency indices will be negligible relative to the MDCT coefficient values associated with the 512-sample frequency indices [...] central frequency index 240 to produce a unit amplitude MDCT coefficient at the 512-sample central frequency index 10. - Continuing with the preceding example, for c1p(m) associated with
code symbol "0," the code frequency index 237 (e.g., the frequency value corresponding to the frequency index f1 associated with the code symbol "0") causes the 512-sample central frequency index 10 to have the highest MDCT magnitude relative to the neighboring 512-sample frequency indices, because the 512-sample central frequency index 10 corresponds to the 12,288-sample central frequency index 240 and the code frequency index 237 is proximate to the 12,288-sample central frequency index 240. Likewise, the second frequency index f2 corresponding to the code frequency index 289 may produce MDCT coefficients with significant MDCT magnitudes in the 512-sample frequency indices at and adjacent to 12. In particular, the code frequency index 289 may cause the 512-sample central frequency index 12 to have the highest MDCT magnitude because the 512-sample central frequency index 12 corresponds to the 12,288-sample central frequency index 288 and the code frequency index 289 is proximate to the 12,288-sample central frequency index 288. Similarly, the third frequency index f3 corresponding to the code frequency index 339 may produce MDCT coefficients with significant MDCT magnitudes in the 512-sample frequency indices at and adjacent to 14. In particular, the code frequency index 339 may cause the 512-sample central frequency index 14 to have the highest MDCT magnitude because the 512-sample central frequency index 14 corresponds to the 12,288-sample central frequency index 336 and the code frequency index 339 is proximate to the 12,288-sample central frequency index 336. Based on the sinusoidal components at each of the ten frequency indices f1 through f10, the MDCT coefficients representing the actual watermarked code signal will correspond to the 512-sample frequency indices ranging from 9 to 29. 
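The correspondence between the two frequency resolutions can be checked with a few lines of arithmetic. The sketch below assumes, following the central frequency indices of table 1300, that a 12,288-sample index maps to a 512-sample index by dividing by 12,288 / 512 = 24 and rounding to the nearest integer; the function name is illustrative only.

```python
SAMPLE_RATE_HZ = 48_000
LONG_WINDOW = 12_288     # samples in the high-resolution analysis window
SHORT_BLOCK = 512        # samples in one 512-sample audio block
RATIO = LONG_WINDOW // SHORT_BLOCK   # 24


def nearest_block_index(code_frequency_index: int) -> int:
    """Map a 12,288-sample frequency index to the nearest 512-sample index."""
    return round(code_frequency_index / RATIO)


print(SAMPLE_RATE_HZ / LONG_WINDOW)   # 3.90625 (Hz per high-resolution index)
for code_index in (237, 289, 339):    # code frequency indices f1..f3 quoted above
    print(code_index, "->", nearest_block_index(code_index))
# 237 -> 10
# 289 -> 12
# 339 -> 14
```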
Some of the 512-sample frequency indices, such as, for example, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, and 29, may be influenced by energy spill-over from two neighboring code frequency indices, with the amount of spill-over being a function of the weighting applied to each sinusoidal component based on the masking energy analysis. Accordingly, in each 512-sample audio block of the compressed digital data stream, the MDCT coefficients may be computed as described below to represent the code signal. - In the compressed AC-3 data stream, for example, each AC-3 frame includes six MDCT coefficient sets (e.g., MA0, MA1, MA2, MA3, MA4, and MA5 of
FIG. 5), with each MDCT coefficient set corresponding to a 512-sample audio block. As described above in connection with FIGS. 5 and 6, each MDCT coefficient is represented as $m_k = M_k \cdot 2^{-X_k} = (s_k \cdot N_k) \cdot 2^{-X_k}$, where $X_k$ is the exponent and $M_k$ is the mantissa. The mantissa $M_k$ is a product of a mantissa step size $s_k$ and an integer value $N_k$. The mantissa step size $s_k$ and the exponent $X_k$ may be used to form a quantization step size $S_k = s_k \cdot 2^{-X_k}$. Referring to the look-up table 600 of FIG. 6, for example, the mantissa step size $s_k$ is 2/15 and the integer value $N_k$ is -2 when the original mantissa value is -0.2666 (i.e., -4/15). - To inject a code signal into the compressed AC-3 data stream, modifications to the mantissa set Mk for k = 9 through 29 are determined. For example, consider a subset of the mantissa set Mk for k = 9 through 29 in which the MDCT coefficient magnitudes C9, C10, and C11 corresponding to the watermarked MDCT coefficients wm9, wm10, and wm11 are -0.3, 0.8, and 0.2, respectively (with the varying amplitude based on the local masking energy). Furthermore, assume that the code MDCT magnitude C11 associated with the 512-sample
central frequency index 11 is the MDCT coefficient having the lowest absolute magnitude (e.g., an absolute value of 0.2) for the entire mantissa set (Ck for k = 9 through 29). The value of the code MDCT magnitude C11 is used to normalize and modify the values of the MDCT coefficients m9, m10, and m11 (as well as the other MDCT coefficients in the set m9 through m29) because the code MDCT magnitude C11 has the lowest absolute magnitude. First, C11 is normalized to 1.0 and its original value of 0.2 is then used to normalize, for example, C9 and C10 as C9 = -0.3 / 0.2 = -1.5 and C10 = 0.8 / 0.2 = 4.0. Then, the mantissa integer value N11 corresponding to the original MDCT coefficient m11 is increased by 1, as this is the minimum amount (due to mantissa step size quantization) by which m11 may be modified to reflect the addition of the watermark code corresponding to C11. Finally, the mantissa integer values N9 and N10 corresponding to the original MDCT coefficients m9 and m10 are modified relative to N11 as follows: [...] Referring to FIG. 6, for example, the table 600 indicates lower limit and upper limit values of -0.9333 to +0.9333. - Thus, the preceding example illustrates how the local masking energy may be used to determine the code magnitude for code symbols to be embedded into a compressed audio signal digital data stream. Moreover, eight successive frames of the compressed digital data stream were modified without performing decompression of MDCT coefficients during the encoding process of the methods and apparatus described herein.
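The normalization step of the preceding example can be reproduced numerically. The Python sketch below is illustrative only: it divides the three code magnitudes quoted above by the one with the smallest absolute value, and it checks that eight frames of six blocks with 256 new samples each give the 12,288-sample sequence. Exact fractions are used to avoid floating-point rounding, and the conversion of a fractional result such as -1.5 into an integer mantissa step is not covered by the sketch.

```python
from fractions import Fraction

# Code MDCT magnitudes from the example (only three of the indices 9..29).
code_magnitudes = {9: Fraction(-3, 10), 10: Fraction(8, 10), 11: Fraction(2, 10)}

# The magnitude with the lowest absolute value becomes the unit of change.
unit = abs(min(code_magnitudes.values(), key=abs))
normalized = {k: float(c / unit) for k, c in code_magnitudes.items()}
print(normalized)   # {9: -1.5, 10: 4.0, 11: 1.0}

# Eight AC-3 frames, six audio blocks per frame, 256 new samples per block.
samples_per_sequence = 8 * 6 * 256
print(samples_per_sequence)   # 12288
```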
-
FIG. 14 is a block diagram of an example processor system 2000 that may be used to implement the methods and apparatus disclosed herein. The processor system 2000 may be a desktop computer, a laptop computer, a notebook computer, a personal digital assistant (PDA), a server, an Internet appliance or any other type of computing device. - The
processor system 2000 illustrated in FIG. 14 includes a chipset 2010, which includes a memory controller 2012 and an input/output (I/O) controller 2014. As is well known, a chipset typically provides memory and I/O management functions, as well as a plurality of general purpose and/or special purpose registers, timers, etc. that are accessible or used by a processor 2020. The processor 2020 is implemented using one or more processors. In the alternative, other processing technology may be used to implement the processor 2020. The processor 2020 includes a cache 2022, which may be implemented using a first-level unified cache (L1), a second-level unified cache (L2), a third-level unified cache (L3), and/or any other suitable structures to store data. - As is conventional, the
memory controller 2012 performs functions that enable theprocessor 2020 to access and communicate with amain memory 2030 including avolatile memory 2032 and anon-volatile memory 2034 via abus 2040. Thevolatile memory 2032 may be implemented by Synchronous Dynamic Random Access Memory (SDRAM), Dynamic Random Access Memory (DRAM), RAMBUS Dynamic Random Access Memory (RDRAM), and/or any other type of random access memory device. Thenon-volatile memory 2034 may be implemented using flash memory, Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), and/or any other desired type of memory device. - The
processor system 2000 also includes aninterface circuit 2050 that is coupled to thebus 2040. Theinterface circuit 2050 may be implemented using any type of well known interface standard such as an Ethernet interface, a universal serial bus (USB), a third generation input/output interface (3GIO) interface, and/or any other suitable type of interface. - One or
more input devices 2060 are connected to theinterface circuit 2050. The input device(s) 2060 permit a user to enter data and commands into theprocessor 2020. For example, the input device(s) 2060 may be implemented by a keyboard, a mouse, a touch-sensitive display, a track pad, a track ball, an isopoint, and/or a voice recognition system. - One or
more output devices 2070 are also connected to theinterface circuit 2050. For example, the output device(s) 2070 may be implemented by media presentation devices (e.g., a light emitting display (LED), a liquid crystal display (LCD), a cathode ray tube (CRT) display, a printer and/or speakers). Theinterface circuit 2050, thus, typically includes, among other things, a graphics driver card. - The
processor system 2000 also includes one or moremass storage devices 2080 to store software and data. Examples of such mass storage device(s) 2080 include floppy disks and drives, hard disk drives, compact disks and drives, and digital versatile disks (DVD) and drives. - The
interface circuit 2050 also includes a communication device such as a modem or a network interface card to facilitate exchange of data with external computers via a network. The communication link between theprocessor system 2000 and the network may be any type of network connection such as an Ethernet connection, a digital subscriber line (DSL), a telephone line, a cellular telephone system, a coaxial cable, etc. - Access to the input device(s) 2060, the output device(s) 2070, the mass storage device(s) 2080 and/or the network is typically controlled by the I/
O controller 2014 in a conventional manner. In particular, the I/O controller 2014 performs functions that enable theprocessor 2020 to communicate with the input device(s) 2060, the output device(s) 2070, the mass storage device(s) 2080 and/or the network via thebus 2040 and theinterface circuit 2050. - While the components shown in
FIG. 14 are depicted as separate blocks within theprocessor system 2000, the functions performed by some of these blocks may be integrated within a single semiconductor circuit or may be implemented using two or more separate integrated circuits. For example, although thememory controller 2012 and the I/O controller 2014 are depicted as separate blocks within thechipset 2010, thememory controller 2012 and the I/O controller 2014 may be integrated within a single semiconductor circuit. - The methods and apparatus disclosed herein are particularly well suited for use with data streams implemented in accordance with the AC-3 standard. However, the methods and apparatus disclosed herein may be applied to other digital audio coding techniques.
- In addition, while this disclosure is made with respect to example television systems, it should be understood that the disclosed system is readily applicable to many other media systems. Accordingly, while this disclosure describes example systems and processes, the disclosed examples are not the only way to implement such systems.
- Although this disclosure describes example systems including, among other components, software executed on hardware, it should be noted that such systems are merely illustrative and should not be considered as limiting. In particular, it is contemplated that any or all of the disclosed hardware and software components could be embodied exclusively in dedicated hardware, exclusively in firmware, exclusively in software or in some combination of hardware, firmware, and/or software.
Claims (52)
- A method for embedding a watermark (230), the method comprising: identifying a transform coefficient associated with a compressed digital data stream (240), the identified transform coefficient having a first mantissa code; characterized by obtaining a second mantissa code associated with a watermarked transform coefficient by: reconstructing an additional uncompressed digital data stream from the compressed digital data stream (240); embedding the watermark (230) in the uncompressed digital data stream to determine a watermarked uncompressed data stream; and performing a transform on the watermarked uncompressed data stream to generate the watermarked transform coefficient; and modifying, without uncompressing the compressed digital data stream (240), the first mantissa code of the identified transform coefficient in the compressed digital data stream (240) based on the second mantissa code to embed the watermark (230) in the compressed digital data stream (240).
- The method as defined in claim 1, wherein the modifying of the first mantissa code of the identified transform coefficient comprises substituting the second mantissa code associated with the watermarked transform coefficient for the first mantissa code associated with the identified transform coefficient.
- The method as defined in claim 1, wherein the obtaining of the second mantissa code associated with the watermarked transform coefficient further comprises: selecting a code signal frequency to encode to the identified transform coefficient based on data to be embedded; obtaining a masking energy associated with the code signal frequency; selecting a magnitude for the watermarked transform coefficient based on the masking energy; and determining the second mantissa code associated with the watermarked transform coefficient based on the magnitude.
- The method as defined in claim 3, wherein the code signal frequency comprises a frequency corresponding to one of a plurality of high resolution frequency domain representations.
- The method as defined in claim 3, wherein the code signal frequency comprises one or more sinusoidal components, and wherein each sinusoidal component includes a frequency based on a desired code.
- The method as defined in claim 1, wherein the reconstructing of the uncompressed digital data stream comprises generating a reconstructed time-domain audio block based on a first time-domain audio block and a second time-domain audio block.
- The method as defined in claim 6, wherein the generating of the reconstructed time-domain audio block based on the first time-domain audio block and the second time-domain audio block comprises adding the first and second time-domain audio blocks.
- The method as defined in claim 6, wherein the generating of the watermarked transform coefficient further comprises: generating a modifiable time-domain audio block based on the reconstructed audio block; and generating a first watermarked audio block and a second watermarked audio block based on the modifiable time-domain audio block and the watermark (230).
- The method as defined in claim 8, wherein the generating of the modifiable time-domain audio block based on the reconstructed audio block comprises concatenating a first reconstructed audio block and a second reconstructed audio block to form a 512-sample audio block.
- The method as defined in claim 1, wherein the watermarked transform coefficient is generated based on a first watermarked audio block and a second watermarked audio block.
- The method as defined in claim 10, wherein the generating of the watermarked coefficient based on the first and second watermarked audio blocks comprises obtaining the second mantissa code associated with the watermarked transform coefficient based on compression information associated with the compressed digital data stream (240).
- The method as defined in claim 1, wherein the identified transform coefficient is associated with one or more modified discrete cosine transform coefficients.
- The method as defined in claim 1, wherein the compressed digital data stream (240) is compressed in accordance with an audio compression standard.
- The method as defined in claim 1, wherein the identifying of the transform coefficient comprises identifying an audio block associated with at least one of a plurality of audio channels.
- The method as defined in claim 1, further comprising identifying compression information associated with the compressed digital data stream (240).
- The method as defined in claim 1 further comprising repacking one or more frames based on a plurality of watermarked transform coefficient sets.
- The method as defined in claim 1, wherein the watermark (230) is associated with one of a media source and a media program.
- An apparatus for embedding a watermark (230), the apparatus comprising: an identifier (410) to identify a transform coefficient associated with a compressed digital data stream (240), the identified transform coefficient having a first mantissa code; and a modifier (430) to: obtain a second mantissa code associated with a watermarked transform coefficient by: reconstructing an additional uncompressed digital data stream from the compressed digital data stream (240); embedding the watermark (230) in the uncompressed digital data stream to determine a watermarked uncompressed data stream; and perform a transform on the watermarked uncompressed data stream to generate the watermarked transform coefficient; and modify, without uncompressing the compressed digital data stream (240), the first mantissa code of the identified transform coefficient in the compressed digital data stream (240) based on the second mantissa code to embed the watermark (230) in the compressed digital data stream (240).
- The apparatus as defined in claim 18, wherein the modifier (430) substitutes the second mantissa code associated with the watermarked transform coefficient for the first mantissa code associated with the identified transform coefficient.
- The apparatus as defined in claim 19, wherein the modifier (430) selects a code signal frequency to encode to the identified transform coefficient based on data to be embedded, determines a masking energy associated with the code signal frequency, selects a magnitude for the watermarked transform coefficient based on the masking energy, and determines the second mantissa code associated with the watermarked transform coefficient based on the magnitude.
- The apparatus as defined in claim 20, wherein the code signal frequency comprises a frequency corresponding to one of a plurality of high resolution frequency domain representations.
- The apparatus as defined in claim 20, wherein the code signal frequency comprises one or more sinusoidal components, and wherein each sinusoidal component has a frequency based on a desired code.
- The apparatus as defined in claim 18, wherein the modifier (430) generates a reconstructed time-domain audio block corresponding to an audio block based on a first time-domain audio block and a second time-domain audio block to reconstruct the uncompressed digital data stream.
- The apparatus as defined in claim 23, wherein the modifier (430) adds the first and second time-domain audio blocks to generate the reconstructed time-domain audio block.
- The apparatus as defined in claim 23, wherein the modifier (430) generates a modifiable time-domain audio block based on a plurality of reconstructed time-domain audio blocks, and generates a first watermarked audio block and a second watermarked audio block based on the modifiable time-domain audio block and the watermark (230).
- The apparatus as defined in claim 25, wherein the modifier (430) concatenates a first one of the plurality of reconstructed audio blocks and a second one of the plurality of reconstructed audio blocks to form a 512-sample audio block.
- The apparatus as defined in claim 26, wherein the modifier (430) determines the second mantissa code associated with a watermarked transform coefficient based on compression information of the compressed digital data stream (240).
- The apparatus as defined in claim 18, wherein the modifier (430) generates the watermarked transform coefficient based on a first watermarked audio block and a second watermarked audio block.
- The apparatus as defined in claim 18, wherein the identified transform coefficient is associated with a plurality of transform coefficient sets comprising one or more modified discrete cosine transform coefficients.
- The apparatus as defined in claim 18, wherein the compressed digital data stream (240) is compressed in accordance with an audio compression standard.
- The apparatus as defined in claim 18, wherein the identifier (410) identifies audio blocks associated with a plurality of audio channels.
- The apparatus as defined in claim 18, further comprising an unpacker (420) that identifies compression information associated with the compressed digital data stream (240).
- The apparatus as defined in claim 18, wherein the watermark (230) is associated with one of a media source and a media program.
- The apparatus as defined in claim 18 further comprising a frame repacker (440) to repack one or more frames based on a plurality of watermarked transform coefficient sets.
- A machine accessible medium having instructions, which when executed, cause a machine to at least: identify a transform coefficient associated with a compressed digital data stream (240), the identified transform coefficient having a first mantissa code; obtain a second mantissa code associated with a watermarked transform coefficient by: reconstructing an additional uncompressed digital data stream from the compressed digital data stream (240); embedding a watermark (230) in the uncompressed digital data stream to determine a watermarked uncompressed data stream; and performing a transform on the watermarked uncompressed data stream to generate the watermarked transform coefficient; and modify, without uncompressing the compressed digital data stream (240), the first mantissa code of the identified transform coefficient in the compressed digital data stream (240) based on the second mantissa code to embed the watermark (230) in the compressed digital data stream (240).
- The machine accessible medium as defined in claim 35, wherein the instructions, which when executed, cause the machine to modify the first mantissa code of the transform coefficient by substituting the second mantissa code associated with the watermarked transform coefficient for the first mantissa code associated with the identified transform coefficient.
- The machine accessible medium as defined in claim 35, wherein the instructions, which when executed, cause the machine to determine the second mantissa code associated with the watermarked transform coefficient by: selecting a code signal frequency to encode to the identified transform coefficient based on data to be embedded; determining a masking energy associated with the code signal frequency; selecting a magnitude for the watermarked transform coefficient based on the masking energy; and determining the second mantissa code associated with the watermarked transform coefficient based on the magnitude.
- The machine accessible medium as defined in claim 37, wherein the code signal frequency comprises a frequency corresponding to one of a plurality of high resolution frequency domain representations.
- The machine accessible medium as defined in claim 37, wherein the code signal frequency comprises one or more sinusoidal components, and wherein each sinusoidal component has a frequency based on a desired code.
- The machine accessible medium as defined in claim 37, wherein the instructions, which when executed, cause the machine to reconstruct the uncompressed digital data stream by generating a reconstructed time-domain audio block based on a first time-domain audio block and a second time-domain audio block.
- The machine accessible medium as defined in claim 40, wherein the instructions, which when executed, cause the machine to generate the reconstructed time-domain audio block by adding the first and second time-domain audio blocks.
- The machine accessible medium as defined in claim 40, wherein the instructions, which when executed, cause the machine to generate the watermarked transform coefficient by: generating a modifiable time-domain audio block based on the reconstructed time-domain audio block; and generating a first watermarked audio block and a second watermarked audio block based on the modifiable time-domain audio block and the watermark (230).
- The machine accessible medium as defined in claim 42, wherein the instructions, which when executed, cause the machine to generate the modifiable time-domain audio block by concatenating a first reconstructed time-domain audio block and a second reconstructed time-domain audio block to form a 512-sample audio block.
- The machine accessible medium as defined in claim 35, wherein the instructions, which when executed, cause the machine to generate the watermarked transformed coefficient by generating one of a plurality of watermarked transform coefficient sets based on a first watermarked audio block and a second watermarked audio block.
- The machine accessible medium as defined in claim 44, wherein the instructions, which when executed, cause the machine to generate the one of the plurality of watermarked coefficient sets based on compression information associated with the compressed digital data stream (240).
- The machine accessible medium as defined in claim 35, wherein the identified transform coefficient is associated with a plurality of transform coefficient sets comprising one or more modified discrete cosine transform coefficients.
- The machine accessible medium as defined in claim 35, wherein the compressed digital data stream (240) is compressed in accordance with an audio compression standard.
- The machine accessible medium as defined in claim 35, wherein the instructions, which when executed, cause the machine to identify the transform coefficient by identifying an audio block associated with at least one of a plurality of audio channels.
- The machine accessible medium as defined in claim 35, wherein the instructions, which when executed, cause the machine to identify compression information associated with the compressed digital data stream (240).
- The machine accessible medium as defined in claim 35, wherein the instructions, which when executed, cause a machine to repack one or more frames based on a plurality of watermarked transform coefficient sets.
- The machine accessible medium as defined in claim 35, wherein the watermark (230) is associated with one of a media source and a media program.
- The machine accessible medium as defined in claim 35 that is one of a programmable gate array, application specific integrated circuit, erasable programmable read only memory, read only memory, random access memory, magnetic media, and optical media.
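The block handling recited in claims 6, 7 and 9 (and in their apparatus and machine-accessible-medium counterparts) can be illustrated with a short Python sketch. It is not the patented implementation: the function names are hypothetical, the 256-sample block length is an assumption inferred from the 512-sample block named in the claims, and plain lists stand in for decoded time-domain audio.

```python
from typing import List, Sequence

# Assumption: each reconstructed block is half of the 512-sample block
# named in the claims.
BLOCK_LEN = 256


def reconstruct_block(first: Sequence[float], second: Sequence[float]) -> List[float]:
    """Add two time-domain audio blocks sample by sample (claim 7)."""
    if len(first) != len(second):
        raise ValueError("blocks must have the same length")
    return [a + b for a, b in zip(first, second)]


def make_modifiable_block(first_rec: Sequence[float],
                          second_rec: Sequence[float]) -> List[float]:
    """Concatenate two reconstructed blocks into one longer block (claim 9)."""
    return list(first_rec) + list(second_rec)


# Usage with placeholder samples standing in for decoded audio.
block_a = reconstruct_block([0.25] * BLOCK_LEN, [0.50] * BLOCK_LEN)
block_b = reconstruct_block([0.10] * BLOCK_LEN, [0.20] * BLOCK_LEN)
modifiable = make_modifiable_block(block_a, block_b)
print(len(modifiable))
```

The watermark would then be embedded in the concatenated block before it is transformed again; that step depends on details the claims leave open, so it is omitted here.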
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US47862603P | 2003-06-13 | 2003-06-13 | |
US57125804P | 2004-05-14 | 2004-05-14 | |
PCT/US2004/018953 WO2005008582A2 (en) | 2003-06-13 | 2004-06-14 | Methods and apparatus for embedding watermarks |
Publications (3)
Publication Number | Publication Date |
---|---|
EP1639518A2 (en) | 2006-03-29 |
EP1639518A4 (en) | 2011-09-28 |
EP1639518B1 (en) | 2018-12-26 |
Family ID=33555503
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP04776572.2A Expired - Lifetime EP1639518B1 (en) | 2003-06-13 | 2004-06-14 | Methods and apparatus for embedding watermarks |
Country Status (8)
Country | Link |
---|---|
EP (1) | EP1639518B1 (en) |
CN (2) | CN1823482B (en) |
AU (2) | AU2004258470B2 (en) |
CA (1) | CA2529310C (en) |
HK (2) | HK1090476A1 (en) |
TW (1) | TWI342515B (en) |
WO (2) | WO2005002200A2 (en) |
ZA (1) | ZA200510074B (en) |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20030131350A1 (en) | 2002-01-08 | 2003-07-10 | Peiffer John C. | Method and apparatus for identifying a digital audio signal |
AU2003268528B2 (en) | 2002-10-23 | 2008-12-11 | Nielsen Media Research, Inc. | Digital data insertion apparatus and methods for use with compressed audio/video data |
US7460684B2 (en) | 2003-06-13 | 2008-12-02 | Nielsen Media Research, Inc. | Method and apparatus for embedding watermarks |
TWI404419B (en) | 2004-04-07 | 2013-08-01 | Nielsen Media Res Inc | Data insertion methods , sysytems, machine readable media and apparatus for use with compressed audio/video data |
NZ552644A (en) | 2004-07-02 | 2008-09-26 | Nielsen Media Res Inc | Methods and apparatus for mixing compressed digital bit streams |
WO2008045950A2 (en) | 2006-10-11 | 2008-04-17 | Nielsen Media Research, Inc. | Methods and apparatus for embedding codes in compressed audio data streams |
EP2337021B1 (en) * | 2008-08-14 | 2018-08-22 | Sk Telecom Co., LTD | Apparatus and method for data transmission in audible frequency band |
EP2550652A4 (en) | 2010-03-25 | 2015-01-21 | Verisign Inc | Systems and methods for providing access to resources through enhanced audio signals |
US8355910B2 (en) * | 2010-03-30 | 2013-01-15 | The Nielsen Company (Us), Llc | Methods and apparatus for audio watermarking a substantially silent media content presentation |
IN2014DN03351A (en) * | 2011-10-25 | 2015-06-26 | Trigence Semiconductor Inc | |
CN102664013A (en) * | 2012-04-18 | 2012-09-12 | 南京邮电大学 | Audio digital watermark method of discrete cosine transform domain based on energy selection |
EP2680259A1 (en) | 2012-06-28 | 2014-01-01 | Thomson Licensing | Method and apparatus for watermarking an AC-3 encoded bit stream |
WO2015038546A1 (en) * | 2013-09-12 | 2015-03-19 | Dolby Laboratories Licensing Corporation | Selective watermarking of channels of multichannel audio |
CN105787444B (en) * | 2016-02-24 | 2019-03-22 | 北方工业大学 | Signal denoising method based on V system |
CN108053831A (en) * | 2017-12-05 | 2018-05-18 | 广州酷狗计算机科技有限公司 | Music generation, broadcasting, recognition methods, device and storage medium |
CN108766449B (en) * | 2018-05-30 | 2020-10-27 | 中国科学技术大学 | Reversible watermark realization method for audio signal |
CN110708376B (en) * | 2019-09-30 | 2020-10-30 | 广州竞远安全技术股份有限公司 | Processing and forwarding system and method for massive compressed files |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1997021293A1 (en) * | 1995-12-06 | 1997-06-12 | Solana Technology Development Corporation | Post-compression hidden data transport |
WO2003017254A1 (en) * | 2001-08-13 | 2003-02-27 | Radioscape Limited | An encoder programmed to add a data payload to a compressed digital audio frame |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6611607B1 (en) * | 1993-11-18 | 2003-08-26 | Digimarc Corporation | Integrating digital watermarks in multimedia content |
US5682463A (en) * | 1995-02-06 | 1997-10-28 | Lucent Technologies Inc. | Perceptual audio compression based on loudness uncertainty |
US6373960B1 (en) * | 1998-01-06 | 2002-04-16 | Pixel Tools Corporation | Embedding watermarks into compressed video data |
JP2001525151A (en) * | 1998-03-04 | 2001-12-04 | コーニンクレッカ フィリップス エレクトロニクス エヌ ヴィ | Watermark detection |
EP1087377B1 (en) * | 1999-03-19 | 2007-04-25 | Sony Corporation | Additional information embedding method and its device, and additional information decoding method and its decoding device |
DE69931932T2 (en) | 1999-12-04 | 2007-05-31 | Deutsche Thomson-Brandt Gmbh | Method and apparatus for decoding and inserting a watermark into a data stream |
US6738744B2 (en) * | 2000-12-08 | 2004-05-18 | Microsoft Corporation | Watermark detection via cardinality-scaled correlation |
-
2004
- 2004-06-10 WO PCT/US2004/018645 patent/WO2005002200A2/en active Application Filing
- 2004-06-11 TW TW093117000A patent/TWI342515B/en not_active IP Right Cessation
- 2004-06-14 CA CA2529310A patent/CA2529310C/en not_active Expired - Lifetime
- 2004-06-14 EP EP04776572.2A patent/EP1639518B1/en not_active Expired - Lifetime
- 2004-06-14 AU AU2004258470A patent/AU2004258470B2/en not_active Ceased
- 2004-06-14 WO PCT/US2004/018953 patent/WO2005008582A2/en active Application Filing
- 2004-06-14 CN CN2004800202008A patent/CN1823482B/en not_active Expired - Fee Related
- 2004-06-14 CN CN201010501205XA patent/CN101950561B/en not_active Expired - Fee Related
-
2005
- 2005-12-12 ZA ZA2005/10074A patent/ZA200510074B/en unknown
-
2006
- 2006-10-03 HK HK06110940.4A patent/HK1090476A1/en not_active IP Right Cessation
-
2010
- 2010-03-09 AU AU2010200873A patent/AU2010200873B2/en not_active Ceased
-
2011
- 2011-04-18 HK HK11103846.7A patent/HK1150090A1/en not_active IP Right Cessation
Also Published As
Publication number | Publication date |
---|---|
TW200517949A (en) | 2005-06-01 |
AU2004258470B2 (en) | 2009-12-10 |
AU2010200873B2 (en) | 2012-09-06 |
ZA200510074B (en) | 2006-12-27 |
WO2005008582A2 (en) | 2005-01-27 |
TWI342515B (en) | 2011-05-21 |
AU2004258470A2 (en) | 2005-01-27 |
CN101950561B (en) | 2012-12-19 |
EP1639518A2 (en) | 2006-03-29 |
AU2004258470A1 (en) | 2005-01-27 |
HK1150090A1 (en) | 2011-10-28 |
HK1090476A1 (en) | 2006-12-22 |
CN101950561A (en) | 2011-01-19 |
WO2005002200A3 (en) | 2005-06-09 |
CN1823482A (en) | 2006-08-23 |
AU2010200873A1 (en) | 2010-04-01 |
WO2005008582A3 (en) | 2005-12-15 |
CN1823482B (en) | 2010-12-01 |
CA2529310C (en) | 2012-12-18 |
CA2529310A1 (en) | 2005-01-27 |
WO2005002200A2 (en) | 2005-01-06 |
EP1639518A4 (en) | 2011-09-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US9202256B2 (en) | Methods and apparatus for embedding watermarks | |
AU2010200873B2 (en) | Methods and apparatus for embedding watermarks | |
US9286903B2 (en) | Methods and apparatus for embedding codes in compressed audio data streams | |
AU2005270105B2 (en) | Methods and apparatus for mixing compressed digital bit streams | |
AU2012261653B2 (en) | Methods and apparatus for embedding watermarks | |
AU2011203047B2 (en) | Methods and Apparatus for Mixing Compressed Digital Bit Streams | |
ZA200700891B (en) | Methods and apparatus for mixing compressed digital bit streams | |
Neubauer et al. | New high data rate audio watermarking based on SCS (scalar costa scheme) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20051215 |
|
AK | Designated contracting states |
Kind code of ref document: A2 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PL PT RO SE SI SK TR |
|
AX | Request for extension of the european patent |
Extension state: AL HR LT LV MK |
|
DAX | Request for extension of the european patent (deleted) | ||
A4 | Supplementary search report drawn up and despatched |
Effective date: 20110831 |
|
RIC1 | Information provided on ipc code assigned before grant |
Ipc: G10L 19/00 20060101AFI20110825BHEP Ipc: G06K 9/00 20060101ALI20110825BHEP Ipc: H04K 1/00 20060101ALI20110825BHEP |
|
17Q | First examination report despatched |
Effective date: 20120425 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: EXAMINATION IS IN PROGRESS |
|
GRAP | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: GRANT OF PATENT IS INTENDED |
|
INTG | Intention to grant announced |
Effective date: 20180703 |
|
GRAS | Grant fee paid |
Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE PATENT HAS BEEN GRANTED |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PL PT RO SE SI SK TR |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: FG4D |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: EP |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R096 Ref document number: 602004053584 Country of ref document: DE |
|
REG | Reference to a national code |
Ref country code: AT Ref legal event code: REF Ref document number: 1082521 Country of ref document: AT Kind code of ref document: T Effective date: 20190115 |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: FG4D |
|
REG | Reference to a national code |
Ref country code: NL Ref legal event code: FP |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: BG Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20190326 Ref country code: FI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20181226 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20190327 Ref country code: SE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20181226 |
|
REG | Reference to a national code |
Ref country code: AT Ref legal event code: MK05 Ref document number: 1082521 Country of ref document: AT Kind code of ref document: T Effective date: 20181226 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: ES Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20181226 Ref country code: PT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20190426 Ref country code: IT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20181226 Ref country code: CZ Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20181226 Ref country code: PL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20181226 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] |
Ref country code: RO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20181226 Ref country code: SK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20181226 Ref country code: EE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20181226 |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R097 Ref document number: 602004053584 Country of ref document: DE |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] |
Ref country code: AT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20181226 Ref country code: DK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20181226 |
|
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an EP patent application or granted EP patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
26N | No opposition filed |
Effective date: 20190927 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] |
Ref country code: MC Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20181226 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PL |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] |
Ref country code: SI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20181226 |
|
REG | Reference to a national code |
Ref country code: BE Ref legal event code: MM Effective date: 20190630 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] |
Ref country code: TR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20181226 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] |
Ref country code: IE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20190614 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] |
Ref country code: LI Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20190630 Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20190630 Ref country code: BE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20190630 Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20190614 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] |
Ref country code: CY Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20181226 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] |
Ref country code: HU Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO Effective date: 20040614 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] |
Ref country code: NL Payment date: 20220626 Year of fee payment: 19 Ref country code: GB Payment date: 20220628 Year of fee payment: 19 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] |
Ref country code: FR Payment date: 20220627 Year of fee payment: 19 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to EPO] |
Ref country code: DE Payment date: 20220629 Year of fee payment: 19 |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R119 Ref document number: 602004053584 Country of ref document: DE |
|
REG | Reference to a national code |
Ref country code: NL Ref legal event code: MM Effective date: 20230701 |
|
GBPC | GB: European patent ceased through non-payment of renewal fee |
Effective date: 20230614 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] |
Ref country code: NL Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20230701 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] |
Ref country code: DE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20240103 Ref country code: GB Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20230614 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to EPO] |
Ref country code: FR Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20230630 |