EP1157499A1 - Signal processing methods, devices, and applications for digital rights management - Google Patents

Signal processing methods, devices, and applications for digital rights management

Info

Publication number
EP1157499A1
EP1157499A1 EP00916232A EP00916232A EP1157499A1 EP 1157499 A1 EP1157499 A1 EP 1157499A1 EP 00916232 A EP00916232 A EP 00916232A EP 00916232 A EP00916232 A EP 00916232A EP 1157499 A1 EP1157499 A1 EP 1157499A1
Authority
EP
European Patent Office
Prior art keywords
data
content
embedded
auxiliary
bit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP00916232A
Other languages
English (en)
French (fr)
Other versions
EP1157499A4 (de)
Inventor
Kenneth L. Levy
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Digimarc Corp
Acoustic Information Processing Labs LLC
Original Assignee
Digimarc Corp
Acoustic Information Processing Labs LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US09/404,291 (external priority, US7055034B1)
Priority claimed from US09/404,292 (external priority, US7197156B1)
Application filed by Digimarc Corp, Acoustic Information Processing Labs LLC
Publication of EP1157499A1
Publication of EP1157499A4
Current legal status: Withdrawn

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/76 Television signal recording
    • H04N5/91 Television signal processing therefor
    • H04N5/913 Television signal processing therefor for scrambling; for copy protection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00 General purpose image data processing
    • G06T1/0021 Image watermarking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/32 Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
    • H04N1/32101 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
    • H04N1/32144 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title embedded in the image data, i.e. enclosed or integrated in the image, e.g. watermark, super-imposed logo or stamp
    • H04N1/32149 Methods relating to embedding, encoding, decoding, detection or retrieval operations
    • H04N1/32203 Spatial or amplitude domain methods
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/32 Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
    • H04N1/32101 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
    • H04N1/32144 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title embedded in the image data, i.e. enclosed or integrated in the image, e.g. watermark, super-imposed logo or stamp
    • H04N1/32149 Methods relating to embedding, encoding, decoding, detection or retrieval operations
    • H04N1/32203 Spatial or amplitude domain methods
    • H04N1/32208 Spatial or amplitude domain methods involving changing the magnitude of selected pixels, e.g. overlay of information or super-imposition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/32 Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
    • H04N1/32101 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
    • H04N1/32144 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title embedded in the image data, i.e. enclosed or integrated in the image, e.g. watermark, super-imposed logo or stamp
    • H04N1/32149 Methods relating to embedding, encoding, decoding, detection or retrieval operations
    • H04N1/32203 Spatial or amplitude domain methods
    • H04N1/32229 Spatial or amplitude domain methods with selective or adaptive application of the additional information, e.g. in selected regions of the image
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/32 Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
    • H04N1/32101 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
    • H04N1/32144 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title embedded in the image data, i.e. enclosed or integrated in the image, e.g. watermark, super-imposed logo or stamp
    • H04N1/32149 Methods relating to embedding, encoding, decoding, detection or retrieval operations
    • H04N1/32267 Methods relating to embedding, encoding, decoding, detection or retrieval operations combined with processing of the image
    • H04N1/32272 Encryption or ciphering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N2201/00 Indexing scheme relating to scanning, transmission or reproduction of documents or the like, and to details thereof
    • H04N2201/32 Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
    • H04N2201/3201 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
    • H04N2201/3225 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of data relating to an image, a page or a document
    • H04N2201/3233 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of data relating to an image, a page or a document of authentication information, e.g. digital signature, watermark
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N2201/00 Indexing scheme relating to scanning, transmission or reproduction of documents or the like, and to details thereof
    • H04N2201/32 Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
    • H04N2201/3201 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
    • H04N2201/3269 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of machine readable codes or marks, e.g. bar codes or glyphs
    • H04N2201/327 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title of machine readable codes or marks, e.g. bar codes or glyphs which are undetectable to the naked eye, e.g. embedded codes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N2201/00 Indexing scheme relating to scanning, transmission or reproduction of documents or the like, and to details thereof
    • H04N2201/32 Circuits or arrangements for control or supervision between transmitter and receiver or between image input and image output device, e.g. between a still-image camera and its memory or between a still-image camera and a printer device
    • H04N2201/3201 Display, printing, storage or transmission of additional information, e.g. ID code, date and time or title
    • H04N2201/328 Processing of the additional information
    • H04N2201/3281 Encryption; Ciphering

Definitions

  • This invention relates to the field of signal processing, and more particularly relates to techniques useful in encoding audio, video, and other content for digital rights management purposes
  • The technology detailed below relates to digital watermarking, or "data hiding," and its use in solving the problem of illegal copying (e.g., hiding authentication information or copy protection information within the original data). Hiding auxiliary information in original data, also called steganography, has been used for thousands of years
  • In steganography, a message is hidden within another object or medium, so that the message is essentially imperceptible to a human observer (or listener)
  • Steganography is related to, but different from, cryptography, in which the existence of a message is typically obvious, but its meaning is not ascertainable without special knowledge
  • Hidden data can be used to prevent unauthorized copying by embedding in the original data commands that are readable by the copying device and that instruct the copying device not to make a usable copy
  • Hidden data can also be used to authenticate data, that is, to prove authorship
  • One such technique entails embedding auxiliary information in an original work in such a manner that special knowledge, such as a secret algorithm or code, is required to detect and/or remove the auxiliary information. The copier would not be able to remove the authentication information, and the original creator could prove his authorship by retrieving the embedded information, which would identify him as the author
  • Data hiding has uses besides the prevention and detection of unauthorized copying
  • One such use is content enhancement, that is, adding information to the original data to enhance the content
  • For example, lyrics could be embedded in audio data on a CD
  • The lyrics could be viewed in a special karaoke machine, while the audio could be played on an existing CD player
  • Hidden data could also be used to associate different segments of video data with different viewer-selectable versions of the video on a DVD. For example, a viewer could select between a version edited for children or an unabridged version, and embedded auxiliary data would indicate to the DVD player which video segments to skip and which to include for the selected version
  • The original data in which the auxiliary data is hidden may represent any type of information that is perceivable with the aid of a presenting device
  • The data may represent music which is presented using a compact disk or audio DVD player, a video film that is presented on a DVD player, or an image that is presented on a computer screen or a printer
  • When the combined data is presented to a user by a normal presentation device, the auxiliary data should not interfere with the use of the original data. Ideally, the user should not be able to perceive the auxiliary data at all
  • Increasing the amount of the embedded auxiliary data or its robustness, that is, its persistence to attack and data transformation, may incidentally increase its perceptibility
  • The degree to which the auxiliary data can be perceived without having an adverse impact on the user varies with the application
  • A minor change from the original data might result in unacceptable audio artifacts
  • For video data, a minor change in a presented image may be acceptable, even though the change might be noticeable if the original and combined works are presented and compared side by side
  • Several techniques are known for hiding auxiliary information in original digital data. Data can be hidden in original data as headers or trailers appended to the original data. Such techniques are of limited use in protection of copyrighted works, because the auxiliary data is easily located and stripped out of the copy, as when changing format. More sophisticated techniques distribute the auxiliary data through the original data,
  • auxiliary information can be added in the frequency domain so that the energy of the auxiliary data is spread across many frequencies in a manner similar to that of the PN sequence
  • auxiliary information can be added to the phase of the frequency components with and without spreading the information across frequencies
  • The ability of users to detect auxiliary data depends not only upon the data, but also upon the characteristics of the human sense organs and the interpretation of sensory stimuli by the brain.
  • Some data hiding techniques transform the original data into the frequency domain and embed auxiliary data in a manner such that the frequency spectrum of the original data reduces the perception of embedded data. This psychophysical effect is known as masking
  • The frequency distribution of the original data is used to determine preferred frequencies at which the embedded auxiliary data will be less perceptible, that is, masked. Others use the fact that we don't perceive phase as accurately as magnitude in the frequency domain
  • Embedded data may be susceptible to various types of corruption and attack
  • Bit-rate reducing (a.k.a. compression) schemes remove the non-perceivable aspects of the data, as is done with MPEG compression. Since a feature of most embedded data is that it is non-perceivable, compression schemes will tend to remove the embedded data. Even if the embedded data is designed to survive the current compression technology, the next generation technology may result in its removal
  • Bit-rate compression schemes are very important in the digital distribution of media, and are receiving much research
  • Noise reduction techniques, e.g., as are used to restore old audio recordings, pose a threat to embedded data. Since most non-perceivable embedded data is similar to noise, it will be removed by these noise reduction techniques. Again, even if the embedded data is designed to survive the current restoration technology, the next generation technology will probably remove it
  • the technology detailed herein relates to ID assignment and binding
  • Content providers may want to allow only the person who bought content to access (i.e., play, copy, or record) that content
  • One way to do this is to provide content that contains an ID, and lock the ID to the consumer, the rendering device or the storage unit
  • these existing solutions of how to use the ID produce unreasonable burdens for consumers
  • a final solution links the content to the storage unit, known as media-binding
  • The storage unit includes, but is not limited to, a magnetic hard drive, optical disk, or electronic memory
  • This solution becomes cumbersome when the content should be allowed to move between different storage unit types
  • a user Joe
  • This audio can only be played in one place, and to move it from Joe's stereo to his car, he has to remember where it was "checked out"; otherwise piracy cannot be controlled. Importantly, he can't just listen to it from each place, as is desirable to the consumer
  • Another aspect of the technology detailed herein relates to scrambling of content to protect same. It is often desirable to degrade digital signals so as to restrict access. For instance, pay-TV broadcasts are degraded so those who haven't paid for the program cannot watch it because the picture is unclear, while those who have paid for the program see a clear picture because their recovery apparatus has been enabled. Most recently, as a result of the digital audio revolution, it is desirable to restrict MP3 (a standard bit-rate compressed audio file format) access. It is also desirable to produce inexpensive portable MP3 players, which in turn require that recovery of the original signal be simple
  • MP3: Moving Picture Experts Group (MPEG) Layer III, a standard bit-rate reduced audio file format
  • This restriction can be implemented via scrambling techniques
  • It is desirable to retrieve information about the scrambled song without de-scrambling the information, as this would allow a user to learn about a song before deciding whether or not to play that song, thus improving speed of the system for the user
  • The prior art contains numerous scrambling and de-scrambling methods. However, these methods are not designed to leave the header information alone during the scrambling and descrambling process; thus they are unable to retrieve information about the scrambled audio without de-scrambling all of the information
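
The goal stated above, reading information about a scrambled song without de-scrambling it, can be illustrated with a minimal Python sketch. It is not the scrambling method of this disclosure: the function name `xor_scramble`, the fixed `header_len`, and the repeating-key XOR are all assumptions made for illustration. XOR is used only because applying it twice with the same key restores the input.

```python
from itertools import cycle


def xor_scramble(data: bytes, key: bytes, header_len: int) -> bytes:
    """Scramble (or descramble) everything after the first header_len bytes.

    Hypothetical helper: the header is copied through untouched so that
    song information stored there stays readable. A repeating-key XOR is
    a toy degradation, not a strong cipher.
    """
    if not key:
        raise ValueError("key must be non-empty")
    header, payload = data[:header_len], data[header_len:]
    # XOR each payload byte with the key, repeating the key as needed.
    scrambled = bytes(b ^ k for b, k in zip(payload, cycle(key)))
    return header + scrambled


if __name__ == "__main__":
    blob = b"HDR:" + b"audio-payload"
    locked = xor_scramble(blob, b"\x5a\xa5", header_len=4)
    print(locked[:4])                                   # header still readable
    print(xor_scramble(locked, b"\x5a\xa5", 4) == blob)  # same call restores it
```

Because the same call scrambles and descrambles, a low-cost player would need only the key and the header length to recover the original signal.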
  • FIG 1 is a flowchart showing acts employed in an illustrative embedding technique
  • FIG 2 is a block diagram showing an apparatus used to embed or retrieve data using the method of FIG 1
  • FIG 3 is a flowchart showing acts employed in an illustrative decoding technique
  • FIG 4 graphically displays operation of a first illustrative embodiment
  • FIG 5 is a flowchart showing the embedding of data in accordance with the first illustrative embodiment. The dashed lines show interaction with the auxiliary data
  • FIG 6 is a flowchart showing the decoding of data. The dashed lines show interaction with the auxiliary data
  • FIG 7 graphically displays operation of a second illustrative embodiment
  • FIG 8 is a flowchart showing the embedding of data in accordance with the second illustrative embodiment. The dashed lines show interaction with the auxiliary data
  • FIG 9 is a flowchart showing the decoding of data. The dashed lines show interaction with the auxiliary data
  • FIG 10 demonstrates aspects of an illustrative embodiment in conjunction with digital compression techniques
  • FIG 11A and B are two block diagrams showing an embedding and retrieving apparatus in accordance with an illustrative embodiment
  • FIG 12 shows an illustrative embodiment of the apparatus of FIG 2 for embedding data
  • FIG 13 shows an illustrative embodiment of the apparatus of FIG 2 for retrieving the data
  • FIG 14 shows a block diagram for an enabling process referenced in the discussion concerning attack resistance
  • FIG 15 shows the block diagram for a registration process
  • FIG 16 demonstrates the way in which dynamic locking blocks the duplication of the auxiliary data
  • FIG 17 displays the input and output for an exclusive-or (XOR) function
  • FIG 18A displays an overview of a process of dynamic locking and embedding of the auxiliary data
  • FIG 18B displays an overview of a process of retrieving and dynamic unlocking of the auxiliary data
  • FIG 19A shows the modification step of dynamic locking for locally masked embedded data
  • FIG 19B shows the modification step of dynamic locking for pulse width modified (PWM) embedded data
  • FIG 19C shows the modification step of dynamic locking for embodiments based upon PN sequences. (Auxiliary data is abbreviated as aux.)
  • FIG 20A displays the pseudocode in the form of a flowchart for locking and embedding the auxiliary data using header blocks
  • FIG 20B displays the pseudocode in the form of a flowchart for retrieving and unlocking the auxiliary data using header blocks
  • FIG 21 shows the basic process behind the example utilizations. (The dotted boxes are optional. The dashed boxes group similar items. In addition, although three key locations are shown, usually only one key is used and its location depends upon the utilization requirements. Finally, the abbreviation ID is used and many times refers to an identifier, but can also refer to any auxiliary information.)
  • FIG 22 shows an apparatus that may be used for these robust data embedding techniques
  • FIG 23A shows an embodiment of the apparatus of FIG 22 for dynamic locking
  • FIG 24B is a block diagram showing an embodiment of the apparatus of FIG 22 for dynamic unlocking
  • FIG 25 is an overview of a process of automatic ID management
  • FIG 26 is the pseudo-code for implementing an exemplary automatic ID management process
  • FIG 27 is an apparatus to implement automatic ID management
  • FIG 28 is a portable MP3 audio player containing the apparatus of FIG 27
  • FIG 29 is an overview of a process employing two watermarks
  • FIG 30 displays the pseudocode for the embedding process of FIG 29
  • FIG 31 displays the pseudocode for the retrieving process for FIG 29
  • FIG 32 displays an apparatus that may be used in connection with the process of FIG 29
  • FIG 33a is an overview of a scrambling process, where dotted boxes are optional
  • FIG 33b is an overview of a descrambling process, where dotted boxes are optional
  • FIG 34a is the pseudo-code for an exemplary scrambling or descrambling process
  • FIG 34b shows the input and output for the exclusive-or (XOR) function
  • FIG 35 shows an exemplary apparatus for performing the scrambling or de-scrambling processes
  • FIG 36 is an overview of a degradation and recovery process
  • FIG 37 is the pseudocode for the degradation and recovery process of FIG 36
  • FIG 38 is a simple and efficient example of the degradation and recovery process using a threshold crossing and adjusting only the next point
  • FIG 39 is the pseudocode for the degradation and recovery process of FIG 38
  • FIG 40 is an overview of an apparatus suitable for implementing the process of FIGS 36-39
  • a method and apparatus of data hiding and retrieval offers high efficiency, with an attendant reduction in cost
  • psychophysic data hiding is used - without the need to modify or transform the original data - in order to identify the locations at which the data should be hidden
  • the encoding leads to essentially no detectable change in the content statistics, making the hidden signal still harder to identify and remove
  • the technology can be implemented so as to permit the user to set parameters that vary the perceptibility, robustness, and embedding rate, permitting the disclosed technology to be used in a broad variety of applications
  • An exemplary apparatus includes a logic processor and storage unit, such as those that come with a standard personal computer or on DSP boards. These devices act as data readers, comparers, and data writers, so that the user's desired watermark can be embedded and/or retrieved
  • An exemplary process involves embedding and retrieving auxiliary information into original data to produce combined data
  • One or more detection criteria can be used to determine where in the original data to locate and/or adjust data points so as to carry the auxiliary information
  • the detection criteria can be used to locate positions - referred to as local masking opportunities - in the original data at which the embedding of auxiliary data will produce less perception, as compared to other simplistic processes
  • The data points in the original data are investigated in accordance with the detection criteria to determine the existence of local masking opportunities
  • The detection criterion or criteria may involve, for example, comparing the data point to a predetermined value and examining the relationship of the data point to nearby points. If the detection criteria are met, one or more of the nearby points, or the data point being investigated, is changed to indicate the value of an embedded bit of auxiliary data
  • The investigation of each point may include not only the value of that point, but also values of one or more nearby points and/or one or more relationships among the points. If the investigation of a point shows the existence of a local masking opportunity, data is embedded by setting the value of one or more of the local points, i.e., either the point being investigated or one or more of the nearby points
  • the value to which the nearby data points are set in the illustrative embodiment is typically dependent upon the data point being investigated, as well as on the value of the auxiliary data bit
  • The data point value can be set so that it has a specified relationship with the neighboring data points. The process is continued until the original data has been traversed or no additional auxiliary data remains to be embedded
  • Retrieving the auxiliary data is the inverse of the embedding process
  • The combined data is traversed using the detection criteria to locate the local masking opportunities. As each local masking opportunity is located, the nearby data point or points that was/were set to indicate the embedded bit is/are read to extract the embedded data. The process is continued until the combined data has been traversed
  • a data point or points are set to a value relative to the nearby data points, and not to an absolute value
  • Both setting data points at the local masking opportunity and setting the data point to a value related to the nearby points, rather than to a value unrelated to the original data, provide masking that reduces the perceptibility of the data
  • the data is extracted by determining the relationships or values of the point or points near the local masking opportunity
  • Only points with large values are adjusted, and by a minimal amount; thus, these embodiments are based upon the masking of a weak stimulus by an intense stimulus
  • the process is applicable to analog and digital data
  • Both embodiments are explained in terms of digital media due to the current switch to digital media and for ease of understanding
  • The first preferred embodiment uses the difference between a data point after a peak and the peak level to carry auxiliary information, as long as the peak is above a large threshold and the original difference between the peak and next point is not too great. This large threshold and minimal differences produce the desired perceptual masking
  • the embedding process adjusts the point after the above-threshold peaks to hide the auxiliary data
  • the retrieving process measures the difference between each above threshold peak level and the next data point to retrieve the auxiliary data
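
The first embodiment can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function names, the integer samples, the choice of a drop of `step` for a 0 bit and `2 * step` for a 1 bit, and the restriction to positive peaks are all hypothetical. The text above says only that the difference between an above-threshold peak and the next point carries the bit.

```python
def _is_peak(x, i, threshold, max_diff):
    """Detection criterion (assumed form): x[i] is an above-threshold local
    peak and the point after it is not too far below it."""
    return (x[i] > threshold and x[i] > x[i - 1]
            and 0 <= x[i] - x[i + 1] <= max_diff)


def embed_peaks(samples, bits, threshold, max_diff, step=1):
    """Return a copy of samples with bits hidden in the point after each peak."""
    if not 0 < 2 * step <= max_diff:
        raise ValueError("need 0 < 2*step <= max_diff")
    out, pending = list(samples), list(bits)
    for i in range(1, len(out) - 1):
        if not pending:
            break
        if _is_peak(out, i, threshold, max_diff):
            bit = pending.pop(0)
            # Assumed encoding: drop of `step` means 0, drop of `2*step` means 1.
            out[i + 1] = out[i] - (2 * step if bit else step)
    return out


def retrieve_peaks(combined, n_bits, threshold, max_diff, step=1):
    """Re-run the same criterion and read the peak-to-next-point difference."""
    bits = []
    for i in range(1, len(combined) - 1):
        if len(bits) == n_bits:
            break
        if _is_peak(combined, i, threshold, max_diff):
            diff = combined[i] - combined[i + 1]
            bits.append(1 if 2 * diff > 3 * step else 0)
    return bits


if __name__ == "__main__":
    audio = [0, 50, 120, 118, 40, 10, 130, 127, 60, 20, 110, 109, 30]
    marked = embed_peaks(audio, [1, 0, 1], threshold=100, max_diff=5)
    print(retrieve_peaks(marked, 3, threshold=100, max_diff=5))
```

The retriever needs no copy of the original data, only the same criterion and parameters; in this sketch it is also told how many bits to expect, which is a simplification.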
  • the second preferred embodiment uses the change in slope across a positive, large, steep, threshold crossing to hide the auxiliary information, as long as the original change in slope is not too great yet steep enough to accept the ensuing adjustment
  • the large threshold produces the desired perceptual masking
  • the embedding process adjusts the change in slope to embed the data, whereas the retrieving process measures the change in slope to obtain the auxiliary data
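
The second embodiment can be sketched the same way. Again this is a hedged illustration with assumed details: the names, the parameters `min_slope`, `max_change`, and `step`, and the encoding (no change in slope for a 0 bit, a reduction of `step` for a 1 bit) are not taken from the disclosure, which states only that the change in slope across a large, steep, positive threshold crossing carries the bit.

```python
def _is_crossing(x, i, threshold, min_slope, max_change):
    """Assumed criterion: a steep positive crossing of threshold at x[i]
    whose slope changes little from before the crossing to after it."""
    rise = x[i] - x[i - 1]     # slope into the crossing
    after = x[i + 1] - x[i]    # slope out of the crossing
    return (x[i - 1] < threshold <= x[i] and rise >= min_slope
            and abs(after - rise) <= max_change)


def embed_slope(samples, bits, threshold, min_slope, max_change, step=2):
    """Return a copy of samples with bits hidden in the change in slope."""
    if not 0 < step <= max_change:
        raise ValueError("need 0 < step <= max_change")
    out, pending = list(samples), list(bits)
    for i in range(1, len(out) - 1):
        if not pending:
            break
        if _is_crossing(out, i, threshold, min_slope, max_change):
            rise = out[i] - out[i - 1]
            bit = pending.pop(0)
            # Only the next point moves: slope change 0 -> bit 0, -step -> bit 1.
            out[i + 1] = out[i] + rise - (step if bit else 0)
    return out


def retrieve_slope(combined, n_bits, threshold, min_slope, max_change, step=2):
    """Measure the change in slope at each qualifying crossing."""
    bits = []
    for i in range(1, len(combined) - 1):
        if len(bits) == n_bits:
            break
        if _is_crossing(combined, i, threshold, min_slope, max_change):
            rise = combined[i] - combined[i - 1]
            after = combined[i + 1] - combined[i]
            bits.append(1 if 2 * (rise - after) > step else 0)
    return bits


if __name__ == "__main__":
    audio = [10, 40, 70, 105, 138, 150, 90, 20, 60, 101, 140, 160]
    marked = embed_slope(audio, [1, 0], threshold=100, min_slope=20, max_change=5)
    print(retrieve_slope(marked, 2, threshold=100, min_slope=20, max_change=5))
```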
  • The preferred embedding process implicitly spreads the energy of the auxiliary information spectrally throughout the original data
  • This broadband approach produces data that is more difficult to remove than sub-band approaches that place the data in an inaudible frequency range
  • parameters can be chosen so that the process produces protected data that is statistically identical to unmarked data
  • the process can be adjusted to produce the desired tradeoffs between perception, coding rate and robustness to attack
  • Such embodiments preferably - although not essentially - operate on the original data without requiring any complex data transformations, such as a Fourier transformation
  • the technology can operate on original data of all types, such as in the frequency or time-frequency domain
  • It can be applied to MPEG data, including the MPEG 1 and 2 specification, ISO 11172-3 and ISO 13818-7 respectively, which exists in the time-frequency domain
  • bit-rate reducing techniques known as compression
  • Removing the watermark can be bypassed by using separate, but possibly identical, watermark procedures during the compression (a.k.a. encoding) and decompression (a.k.a. decoding) process
  • a system comprises a method and apparatus for hiding auxiliary information (or data) in original data and for retrieving the auxiliary information
  • FIG 1 is an overview of the steps involved in carrying out an illustrative method to embed data
  • FIG 2 shows a block diagram of an apparatus 10 that may be used to perform the method of FIG 1
  • Apparatus 10 includes a logic processor 14, which can be a general purpose microprocessor, such as an Intel Pentium or DEC Alpha, of the type found in a personal computer or engineering workstation; a digital signal processor (DSP), such as the Texas Instruments TMS320 line; a specialized CPU, such as a media processor; or a custom processing circuit
  • Apparatus 10 also includes a storage unit 18, which can include random access memory (RAM) or delays
  • The original data mentioned below may represent sound that is recorded by sampling its amplitude periodically, with each sample using binary numbers to represent the magnitude of the sound at a particular time. Likewise, the samples may represent pixels of an image or video. Still further, the original data can be any series of binary data associated into groups. Similarly, the illustrative auxiliary information is any data that can be represented as "1"s and "0"s, but other symbol alphabets can likewise be used with corresponding adaptation of the disclosed arrangements. FIG 1 shows that in step 20, a portion of the original data is read into storage unit 18 of FIG 2
  • Step 24 shows that the sample data is investigated sequentially by the logic processor 14 to locate sample points that meet predefined detection criteria
  • Such sample points indicate the existence of "local masking opportunities," because the detection criteria are such that a change in the value of the sample or a few samples at or near that point to embed auxiliary data will usually be minimally perceivable by the listener of the sound
  • The amount of masking will depend upon the data type and settings chosen by the user. For example, the masking will be great for uncompressed audio and less for bit-rate reduced (digitally compressed) audio such as MPEG
  • The same detection criteria will be applied during data retrieval to locate the hidden data
  • Each point in the original data is preferably investigated to determine whether it represents a local masking opportunity
  • the criterion or criteria for determining local masking opportunities may entail not only the value of the point being investigated, but may also include the value of at least one nearby or neighboring point, or the relationship between the nearby point and the point being investigated
  • the detection criteria can require, for example, that the point being investigated exceeds a certain threshold value and/or that the point be a local maximum or peak, and/or that the point is a point of local maximum in a first- or higher-order derivative
  • the criteria may include a requirement that a point subsequent to the point being investigated have a value that differs from the point being investigated by less than a prescribed amount, or have some other relationship to the point being investigated
  • the sample data points can be considered as plotted on a graph, for example with time on the x-axis and the magnitude of the sample on the y-axis
  • the series of data points can be considered as having a slope between any points, and the value of the slope can be part of the detection criteria
  • the criteria may specify, for example, that a slope defined by the point being investigated and a preceding point exceed a particular value, or that the change in slope before and after the point not exceed a particular value
  • the criteria could include any combination of requirements; the detailed examples are not essential or restrictive of the scope of the detailed technology
  • the illustrative embodiment can determine masking opportunities using only nearby or neighboring points, i.e., points that are too close to use to determine useful frequency data
  • Nearby points include points that are next to the point being investigated or within a relatively small number of points, preferably less than 50 and more preferably less than 20
  • the criterion can be as simple as determining whether the point exceeds a threshold
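The detection criteria described above can be combined in a short scan over neighboring samples. The following is a minimal sketch, not the patented implementation: the function name `find_masking_opportunities` and the parameters `thr` and `max_step` are hypothetical, and the particular combination of criteria (threshold, local peak, small difference to the next point) is only one of the combinations the text allows.

```python
def find_masking_opportunities(x, thr, max_step):
    """Return indices n where x[n] satisfies three example detection criteria.

    Assumed criteria (one possible combination among those described):
      1. x[n] exceeds the threshold thr,
      2. x[n] is a local peak relative to its immediate neighbors,
      3. the next point differs from x[n] by less than max_step.
    Only neighboring points are inspected; no frequency analysis is used.
    """
    hits = []
    for n in range(1, len(x) - 1):
        above = x[n] > thr
        peak = x[n] > x[n - 1] and x[n] >= x[n + 1]
        gentle = abs(x[n] - x[n + 1]) < max_step
        if above and peak and gentle:
            hits.append(n)
    return hits


samples = [0, 6000, 5995, 100, 300, 200, 7000, 6000, 0]
candidates = find_masking_opportunities(samples, thr=5000, max_step=16)
```

Because the same function can be run on the combined data, a retriever using identical parameters would locate the same points, provided the embedding step keeps each modified point within the criteria.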
  • Step 26 shows that when a point meeting the detection criteria is located, the value of a specified sample point or sample points near the local masking opportunity is changed to reflect the value of the auxiliary information to be embedded
  • the changed sample may be simply set to a particular value to signify the value of the embedded bit
  • the new value typically depends upon the value of both the auxiliary data and the neighboring point or points that were investigated to detect the local masking opportunity
  • the point may be set so that the change in value or slope signifies whether the embedded bit is a "1" or a "0" (or other symbol)
  • auxiliary bit as the least significant bit, or other, preferably low order, bit
  • the embedded bit is still masked because the location of the embedded bit was chosen to represent a local masking opportunity, such as when the data is larger than a prescribed threshold
  • Step 30 shows that the process is ended at step 32 if no additional auxiliary data needs to be embedded. Otherwise, step 34 shows that if there is additional data in memory, the search for local masking opportunities continues. Step 36 shows that if all data in memory has not yet been searched, additional data is read into memory. Skilled persons will recognize that some overlap of the data in memory may be required to prevent missing local masking opportunities that occur at the beginning or end points of the data in memory.
  • FIG 3 broadly shows the steps involved in carrying out a decoding method. Because the same processor and memory that were used to embed the data can be used to retrieve the data, although this is not necessary, the steps of FIG 3 will describe extracting data using the hardware components of FIG 2.
  • Step 50 shows that a portion of the original data is read into storage unit 18
  • Step 52 shows that logic processor 14 investigates each data point to determine the existence of a local masking opportunity. If a sample point meets the local masking opportunity criteria, step 54 shows that the embedded "1" or "0" bit of auxiliary data is extracted using the inverse relationship of how the auxiliary data was embedded.
  • Step 56 shows that if additional combined data is in the memory, the logic processor continues to investigate the remaining points with step 52
  • Step 58 shows that if all the data in memory has been investigated, but there is uninvestigated combined data in the data file, additional data is read into memory in step 50
  • Step 60 shows that the process is ended when all the combined data has been investigated
  • the methodology is applicable to analog or digital data, even though the preferred embodiments use digital data
  • analog data can be sampled at the Nyquist rate to produce digital data in which additional information is hidden
  • the combined digital data can be returned to the analog domain by any existing method known in digital signal processing (DSP)
  • the analog data now contains the embedded data, which can be decoded by using sampling
  • the original data may represent pressure versus time, magnitude versus frequency, or a specific frequency magnitude versus time
  • the original data may represent gray code versus space, separate or combined RGB or equivalent values versus space, or magnitude versus frequency
  • Video data encompasses the image data with an added dimension of time available
  • the auxiliary data may be embedded in scaling factors or frequency coefficients versus frequency or time or both
  • a threshold greater than 48 dB above the minimum value is desirable. This threshold allows the data to be changed with minimal perception due to masking.
  • Masking is the psychological term defined as the increase in threshold for steady-state stimuli
  • Use of the term in this disclosure is much broader than that definition, and describes how one set of data reduces the perception of other data. Specifically, for uncompressed, magnitude-time data, the sensitivity of the sensory system decreases with increased input level; thus the small adjustment of a neighboring data point is masked by the large value of the threshold.
  • For time-frequency data such as MPEG data, the masking is minimal and more similar to the textbook definition, since masking has been used to reduce the bit rate
  • the first particular embodiment is based upon hiding the auxiliary information in large peaks within the original data
  • the auxiliary information is preferably broken into N bit words, with synchronization data placed between the words for better error recovery
  • the auxiliary information does not need to include sync pulses between the words if robustness to noise or modified files is not needed
  • FIG 4 conceptually shows that the first embodiment detects a peak or local maximum and sets the value of the subsequent point in relation to the peak to indicate the value of the embedded bit
  • FIG 5 includes the pseudocode in the form of a flowchart for the embedding process
  • The process begins by searching the original data until a positive peak that lies above a large threshold, labeled thr, and has a relatively small decrease after the peak, labeled dS, is found. This process is demonstrated in boxes 200, 210 and 220.
  • the detection criteria are checked in the most computationally efficient order, which includes first checking to see if the point represents a peak, since a peak is the criterion least likely to be satisfied
  • The data point after the peak is adjusted according to a user defined bit depth, b, to carry the auxiliary information. Specifically, if it is the beginning of an auxiliary word, the synchronization code is embedded by adjusting the point after the peak, x[n+1], to be equal to the peak, x[n], minus half of the maximum allowable change, dS/2, between the peak and the next point, as shown in boxes 242, 230 and 250.
  • An auxiliary information bit of one is encoded by adjusting the point after
  • FIG 6 displays the pseudocode in the form of a flowchart for the retrieval process of the first particular embodiment
  • The process begins by searching the original data until a positive peak that lies above a large threshold, labeled thr, and has a relatively small decrease after the peak, labeled dS, is found. This process is demonstrated in boxes 300, 310 and 320. Again, the search first looks for a peak to improve efficiency.
  • The difference between the peak and the data point after the peak is measured to retrieve the auxiliary information. Specifically, if the peak minus the point after the peak, x[n]-x[n+1], is close to half of the maximum allowable change, dS/2, a new auxiliary word is beginning, as shown in boxes 330 and 350. If the peak minus the point after the peak, x[n]-x[n+1], is approximately equal to half the maximum change, dS/2, minus half the bit depth magnitude, 2^(b-1), an auxiliary bit of one is found. If this difference, x[n]-x[n+1], is close to the sum of half the maximum change, dS/2, and half the bit depth magnitude, 2^(b-1), an auxiliary bit of zero is retrieved. This retrieving of zeros and ones is shown in boxes 340, 360, 370, 380, and 382. The two points immediately after retrieving the data can be skipped, as shown in box 390.
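The peak-based embedding and retrieval rules of the first embodiment can be illustrated with a small round trip. This is a sketch under stated assumptions rather than the flowcharts of FIGS 5 and 6: the names `embed_peaks`, `retrieve_peaks` and the `SYNC` marker are hypothetical, samples are integers, dS is even and chosen larger than 2^b so that every embedded difference stays strictly between 0 and dS, and retrieval matches differences exactly instead of "approximately". The offsets for one and zero bits mirror the retrieval rule (dS/2 minus or plus 2^(b-1)).

```python
SYNC = "S"  # hypothetical marker for the synchronization code


def _is_candidate(x, n, thr, dS):
    # Positive peak above thr with a small decrease (< dS) to the next sample.
    return x[n] > x[n - 1] and x[n] > thr and 0 < x[n] - x[n + 1] < dS


def embed_peaks(samples, symbols, thr, dS, b):
    """Embed SYNC / 1 / 0 symbols in the point after each qualifying peak."""
    x = list(samples)
    half = 1 << (b - 1)  # half the bit depth magnitude, 2^(b-1)
    offsets = {SYNC: dS // 2, 1: dS // 2 - half, 0: dS // 2 + half}
    pending = list(symbols)
    n = 1
    while n + 1 < len(x) and pending:
        if _is_candidate(x, n, thr, dS):
            x[n + 1] = x[n] - offsets[pending.pop(0)]
            n += 3  # skip the two points after the peak
        else:
            n += 1
    return x


def retrieve_peaks(x, thr, dS, b):
    """Recover symbols from the difference x[n] - x[n+1] at each peak."""
    half = 1 << (b - 1)
    meaning = {dS // 2: SYNC, dS // 2 - half: 1, dS // 2 + half: 0}
    out = []
    n = 1
    while n + 1 < len(x):
        if _is_candidate(x, n, thr, dS):
            diff = x[n] - x[n + 1]
            if diff in meaning:  # exact match; a real decoder would tolerate noise
                out.append(meaning[diff])
            n += 3
        else:
            n += 1
    return out


original = [0, 6000, 5995, 100, 0, 7000, 6990, 50, 0, 8000, 7999, 0, 0]
marked = embed_peaks(original, [SYNC, 1, 0], thr=5000, dS=16, b=3)
recovered = retrieve_peaks(marked, thr=5000, dS=16, b=3)
```

With these parameters the three peaks carry differences of 8, 4 and 12, so only the sample after each peak changes, and by at most a few quantization steps.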
  • the threshold is usually around 48 dB above the minimal quantization, as discussed above
  • the threshold may be increased to reduce perception
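The decibel figures quoted for the threshold can be checked with simple arithmetic. The sketch below assumes that levels are expressed relative to one quantization step (an amplitude of 1), which is consistent with the disclosure's pairing of a threshold value of 5,000 with 74 dB; the helper names are hypothetical.

```python
import math


def level_db(amplitude):
    """Level in dB relative to one quantization step (assumed reference)."""
    return 20.0 * math.log10(amplitude)


def amplitude_for_db(db):
    """Sample magnitude corresponding to a level in dB above one step."""
    return 10.0 ** (db / 20.0)


threshold_5000_db = level_db(5000)      # roughly 74 dB
amplitude_48_db = amplitude_for_db(48)  # roughly 251 quantization steps
eight_bits_db = level_db(2 ** 8)        # roughly 48 dB, i.e. about 8 bits
```

Under this assumption a 48 dB threshold corresponds to a magnitude of about 2^8 in 16 bit audio, so an adjustment of a few low-order bits is small relative to any point that qualifies.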
  • the bit depth is an indication of the relative change to be made to the sample point to embed the data
  • the smaller the bit depth the less disturbance of the original data, making the embedded data less perceptible to the listener, but less robust, that is, more susceptible to being lost to noise or attack
  • Minimal perception in 16 bit audio is found when bit depths are between 1 and 6 bits
  • higher bit depths can be used if one desires more robustness to noise in trade for more perceptual degradation
  • the maximum allowable change after the peak, dS, must be at least the desired bit depth magnitude, 2^b
  • the large threshold usually reduces the perceivable effect of adding the auxiliary information, and may even cause the auxiliary data to be non-perceivable, depending upon the data type
  • Many data points satisfy the small difference between the peak and data point after the peak, because with a slope near 0 at the peak, the data is changing the least. This small difference means that the adjustment will be small as compared to the threshold, thus reducing the chance of perceiving the embedded auxiliary data.
  • The pseudocode is shown using a buffer with what appears to be look-ahead capabilities (i.e., x[n+1]). This makes the process easier to explain and understand. However, the process is causal, as determined by replacing n+1 with k, and keeping track of the last two points, x[k-1] and x[k-2].
  • the peak extends for one more point each direction where x[n]>x[n-2], x[n]>x[n+2], x[n]>x[n-3], x[n]>x[n+3], and so on, or the peak is of minimal sharpness, i.e., x[n]-x[n-1]>5
  • Bit rates of between 99 and 268 bits per second were achieved in CD quality audio data using a bit depth of 5 and a threshold of 5,000 (74 dB). Using a bit depth of 8 and maintaining a threshold at 5,000, the average embedding rate was 1,000 bits per second. When the threshold is lowered to 2,000 at a bit depth of 8,
  • the second particular embodiment hides the auxiliary information in large, steep threshold crossings which do not have a large change in slope
  • The method is more robust to noise changing the detected location. This occurs because it is less likely that noise changes the location of a threshold crossing as compared to a peak, since a threshold crossing usually has a slope larger than the slope at the peak, which, by definition, has a slope near zero. Testing with audio data has shown this embodiment, as compared to the first embodiment, to produce a lower embedded data rate and to be more perceivable at a lower bit depth, in trade for the robustness to noise.
  • FIG 7 conceptually shows that data is embedded by setting the slope after the threshold crossing in relation to the slope at the threshold crossing
  • the pseudocode for hiding the auxiliary information using the second preferred embodiment is presented in the form of a flow chart
  • The process begins by searching the original data until a positive, large, steep threshold (labeled thr) crossing with minimal change in slope (labeled dS) is found. This process is demonstrated in boxes 400, 410 and 420.
  • the data point after the threshold crossing is adjusted according to a user defined bit depth (b) to carry the auxiliary information in the change in slope
  • the change in slope is defined as (x[n+1]-x[n])-(x[n]-x[n-1]), or equivalently as x[n+1]-2*x[n]+x[n-1]
  • the synchronization code is embedded by adjusting the point after the threshold crossing, x[n+1], so that the change in slope is zero, as shown in boxes 442.
  • An auxiliary bit of one is encoded by adjusting the point after the threshold crossing, x[n+1], so that the change in slope is positive by an amount equal to half the bit depth magnitude, 2^(b-1)
  • an auxiliary bit of zero is encoded by adjusting the point after the threshold crossing so that the change in slope is negative by an amount equal to half the bit depth magnitude, 2^(b-1)
  • This embedding of zeros and ones is shown in boxes 442, 440, 460, 470 and 480. The point after embedding the data can be skipped for efficiency, as shown in box 490.
  • FIG 9 shows the pseudocode in the form of a flowchart for the retrieval of the auxiliary information in the second preferred embodiment
  • The process begins by searching the original data until a positive, large, steep threshold (labeled thr) crossing with minimal change in slope (labeled dS) is found. This process is demonstrated in boxes 500.
  • the change in slope around the threshold is measured to retrieve the auxiliary information
  • the change in slope is defined as (x[n+1]-x[n])-(x[n]-x[n-1]), or equivalently as x[n+1]-2*x[n]+x[n-1]
  • a new auxiliary word is begun, as shown in boxes 530 and 550
  • the threshold crossing has a positive change in slope approximately equal to half the bit depth magnitude, 2^(b-1)
  • an auxiliary bit of one is found
  • the threshold crossing has a negative change in slope approximately equal to half the bit depth magnitude, 2^(b-1)
  • An auxiliary bit of zero is retrieved. This retrieving of zeros and ones is shown in boxes 540, 560, 570, 580, and 582.
  • the point after retrieving the data can be skipped for efficiency as shown in box 590
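The threshold-crossing embodiment can be sketched in the same way, with the symbol carried in the change in slope x[n+1]-2*x[n]+x[n-1]. The code below is an illustration under assumptions, not the flowcharts of the second embodiment: function names and the `SYNC` marker are hypothetical, a crossing is taken to mean x[n-1] < thr <= x[n], the steepness test uses the condition x[n]-x[n-1] > dS + 2^(b-1), the change in slope must be strictly smaller than dS in magnitude, and retrieval matches values exactly.

```python
SYNC = "S"  # hypothetical marker for the synchronization code


def _slope_change(x, n):
    # (x[n+1]-x[n]) - (x[n]-x[n-1])
    return x[n + 1] - 2 * x[n] + x[n - 1]


def _is_crossing(x, n, thr, dS, half):
    crosses = x[n - 1] < thr <= x[n]          # positive-going crossing (assumed form)
    steep = x[n] - x[n - 1] > dS + half       # keeps x[n+1] above thr after adjustment
    smooth = abs(_slope_change(x, n)) < dS    # minimal change in slope
    return crosses and steep and smooth


def embed_crossings(samples, symbols, thr, dS, b):
    x = list(samples)
    half = 1 << (b - 1)
    delta = {SYNC: 0, 1: half, 0: -half}
    pending = list(symbols)
    n = 1
    while n + 1 < len(x) and pending:
        if _is_crossing(x, n, thr, dS, half):
            # Set x[n+1] so the change in slope equals the symbol's delta.
            x[n + 1] = 2 * x[n] - x[n - 1] + delta[pending.pop(0)]
            n += 2  # skip the point after embedding
        else:
            n += 1
    return x


def retrieve_crossings(x, thr, dS, b):
    half = 1 << (b - 1)
    meaning = {0: SYNC, half: 1, -half: 0}
    out = []
    n = 1
    while n + 1 < len(x):
        if _is_crossing(x, n, thr, dS, half):
            change = _slope_change(x, n)
            if change in meaning:
                out.append(meaning[change])
            n += 2
        else:
            n += 1
    return out


original = [4000, 4900, 5100, 5302, 0, 4990, 5010, 5031, 0, 4000, 6000, 7997, 0]
marked = embed_crossings(original, [SYNC, 1, 0], thr=5000, dS=8, b=3)
recovered = retrieve_crossings(marked, thr=5000, dS=8, b=3)
```

Here dS equals 2^b, so the embedded changes in slope (0, +4, -4) stay inside the detection window and the retriever finds the same three crossings.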
  • the pre-threshold change condition, x[n]-x[n-1] > dS + 2^(b-1), in the detection criteria of boxes 420 and 520 requires that the adjustment of the next data point does not bring the point back below the threshold
  • the large threshold and the maximum allowable change in slope condition, dS, reduce the perception of embedding the auxiliary data, and depending upon the data type can cause the embedding process to be completely non-perceivable
  • The maximum allowable change in slope condition, dS, can have any value. A larger value allows a higher data rate with more perceivable distortion, whereas a smaller value produces minimal distortion with a lower data rate.
  • Our preferred setting for dS in 16 bit audio is equal to the bit depth magnitude, 2^b. Again, bit depths below 6 bits produce minimal distortion, but higher bit depths can be used for robustness to noise and attack.
  • A very simple embodiment could use a simple threshold to determine a local masking opportunity and then encode the auxiliary data in the LSB of the point exceeding the threshold or of another point in the vicinity of the point exceeding the threshold. Such a variation is extremely simple, yet provides reduced perceptibility compared to prior art LSB schemes. As with the other embodiments, one must ensure that changing the value does not remove the point from the detection criterion. In this case, one could simply skip embedding where the change brings the data below the threshold, and change the current value of the data point to the threshold so that the data point will be skipped in the retrieving phase.
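The simple threshold-plus-LSB variant, including the rule for points that would fall out of the detection criterion, can be sketched as follows. The function names are hypothetical, samples are assumed to be non-negative integers wherever they exceed `thr`, and "exceeds the threshold" is taken as a strict comparison.

```python
def embed_lsb(samples, bits, thr):
    """Write each bit into the LSB of successive samples that exceed thr."""
    x = list(samples)
    pending = list(bits)
    for i, value in enumerate(x):
        if not pending:
            break
        if value > thr:
            candidate = (value & ~1) | pending[0]
            if candidate > thr:
                x[i] = candidate
                pending.pop(0)
            else:
                # The change would drop the point to the threshold or below:
                # park it at the threshold so the retriever skips it.
                x[i] = thr
    return x


def retrieve_lsb(x, thr):
    """Read the LSB of every sample that exceeds thr."""
    return [value & 1 for value in x if value > thr]


original = [100, 5001, 6000, 20, 7001, 5003]
marked = embed_lsb(original, [0, 1, 0], thr=5000)
recovered = retrieve_lsb(marked, thr=5000)
```

In this example the sample 5001 cannot carry a zero bit without reaching the threshold, so it is set to 5000 and the bit moves to the next qualifying sample.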
  • Attack is defined as a person or machine trying to remove the auxiliary information from the combined signal without distorting the perception of the original data
  • a dynamic threshold can make it harder to remove the auxiliary information
  • An example dynamic threshold is an offset sinusoidal waveform
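An offset sinusoidal threshold of the kind mentioned above might look like the following sketch. The function names and the `offset`, `amplitude` and `period` values are illustrative assumptions only; the embedder and retriever would need to share the same parameters for the detected locations to agree.

```python
import math


def dynamic_threshold(n, offset=5000.0, amplitude=1000.0, period=441):
    """Threshold at sample index n: a sinusoid riding on a constant offset."""
    return offset + amplitude * math.sin(2.0 * math.pi * n / period)


def exceeds_threshold(x, **params):
    """Indices whose sample value exceeds the time-varying threshold."""
    return [n for n, value in enumerate(x) if value > dynamic_threshold(n, **params)]


hits = exceeds_threshold([5500, 5500], period=4)
```

Because the threshold moves with the sample index, an attacker who finds the threshold at one location does not immediately know it elsewhere.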
  • dS should be small and close to 2^b so that the process does not change the distribution of the differences between neighboring points, i.e., is statistically invisible; thus, an attacker cannot use this data to find the threshold
  • a DC shift is obviously a more potent attack for the second preferred embodiment than the first, but could affect the first preferred embodiment since threshold is one of the detection criteria
  • The process may use more global definitions for peaks and threshold crossings, for better robustness to noise. Specifically, a peak or threshold crossing definition may be used that includes more points on each side.
  • auxiliary information does not need to include the extra sync pulses between the N-bit words, especially if robustness to noise is not needed
  • negative going peaks and/or more thresholds can be used to increase bit rate
  • the process can use more than a binary system in adjusting the second bit to encode more information
  • the result is more likely to be perceivable or less robust to attack
  • An interesting twist is to embed different auxiliary information on positive and negative peaks, and/or on various thresholds
  • the channels can be coded separately, or encoding can move between channels with consecutive points moving between left and right channels
  • a change that can improve the perception is to move the data point after the embedded point towards the value of the embedded point if combining the auxiliary information causes a large value change in the embedded point
  • the data does not have to be relative to time
  • the data may represent magnitude versus frequency
  • The data could be viewed as magnitude of a specific frequency versus time. All frequencies may be included for an increased data rate.
  • embedding may be performed in the spectrum or spectrogram
  • bit-reduced data such as MPEG compressed data
  • MPEG-compressed data comprises a series of data points that represent scaling factors and frequency coefficients
  • Auxiliary data may be embedded in the series of MPEG data points, using, for example, one of the two particular embodiments described above
  • One may want to increase the peak or modify its LSB such that the term is only increased, rather than decrease the point after the peak, such that quantization error is not increased in the MPEG data, especially when dealing with scaling factors. Skilled persons will recognize that, in using data like MPEG data that is divided into time frames, one may use, for example
  • the process can be used to embed copyright information
  • This information may include a code to determine if the data can be copied
  • Copying devices such as CD writers, can include an inexpensive integrated circuit that could interpret embedded data and prohibit copying
  • The author's or artist's name and affiliation can be embedded. In this utilization, the auxiliary information is small and would be repeated over and over with synchronization pulses between each duplication.
  • the copy code could be embedded using embodiment 1, and the creator's name and affiliation using embodiment 2 (i.e., several embedded data may co-exist in a work)
  • Such technology can be used to send additional information
  • This information may be transmitted in ASCII or ANSI with 8 bit "words" (not to be included with digital words being defined as
  • The information may be a secret message, lyrics to the song, or a description of the artwork. For lyrics, this could be useful for karaoke machines and CD or DVD players.
  • FIGS 10A and B demonstrate an illustrative process for data hiding, if at some point the data must be compressed. For example, this may happen while transmitting the data.
  • the auxiliary information is embedded in the non-compressed data using the above-detailed process or any other method, as shown in box 600
  • the auxiliary information is retrieved via the above-detailed or the other appropriate method, and re-embedded in the compressed data with the above-detailed process or any other method, as shown in box 610
  • The algorithm for data hiding in the compressed and non-compressed data may be the same algorithm, differing only by using different original data. Or they may be different.
  • The auxiliary information is retrieved from the compressed data by the above-detailed or other method, the data is uncompressed, and the auxiliary information is embedded in the uncompressed data, as shown in box 620. Finally, when needed, the auxiliary information can be retrieved from the data using the above-detailed or other method, as shown in 630.
  • the algorithm for data hiding in the compressed and non-compressed data may be the same algorithm, differing by only using different original data, or not
  • FIG 12 shows the implementation with a digital processor 1200 and digital memory 1210
  • the digital processor 1200 may be defined as the equivalent of a digital signal processor (DSP), general-purpose central processing unit (CPU), or a specialized CPU, including media processors
  • a likely DSP chip is one of the Texas Instruments TMS320 product line
  • a CPU could include one of Intel's Pentium line or the Motorola/IBM PowerPC product line
  • the design is straightforward for someone familiar with the state of the art given the pseudocode in Figs 5 through 9
  • With reference to FIG 13, a person familiar with the state of the art could implement the process with analog and digital circuitry, either separate or in an application specific integrated circuit (ASIC)
  • the analog and digital circuitry could include any combination of the following devices: a digital-to-analog converter (D/A), comparators, sample-and-hold circuits, delay elements, analog-to-digital converter (
  • FIGs 11A and B show that the logic processor and storage unit typically comprise an embedding apparatus 700 and retrieving apparatus 770
  • the embedding apparatus 700 includes the following: a data reader 710 to read original data 720 and auxiliary data 730
  • a comparer 740 that is, a circuit or device for comparing data points with known values or other data points
  • a data writer 750 to write the combined data 760 to a permanent or temporary storage media
  • the retrieving apparatus 770 includes the following: a data reader 715 to read the combined data
  • The data reader 715 may be identical to the embedding data reader 710, but it also may be different. A comparer 745, that is, a circuit or device for comparing data points with known values or other data points and, if necessary, producing the auxiliary bit or bits.
  • the comparer 745 may be identical to or different from the embedding comparer 740
  • a data writer is not always necessary since the auxiliary information may be taken from memory or only displayed for the corresponding use
  • The term "attack" is used herein, but is meant to include both deliberate efforts to remove embedded data, and incidental removal of such data. Attack may include duplication, which is defined as being able to replicate or impersonate the embedded data from one data segment to another. Attack may also include modification, which is defined as changing the embedded data for a desired effect, such as from "no copying" to
  • the first embodiment utilizes an enabling process, which involves using embedded data to enable an action, such as copying, playing or otherwise rendering
  • the second embodiment utilizes a registration process, where the recording device embeds its registration in the data
  • the recording device can refer to a physical device, such as a
  • The dynamic locking and unlocking embodiments improve the robustness of existing or future embedded data techniques to duplication and/or modification. Dynamic locking causes the embedded data to be dependent upon the media, e.g., by including one or both of the following steps
  • the first step includes modifying the auxiliary information by the media
  • the second step includes encrypting the auxiliary information, possibly modified in the first step
  • the encryption technique may be RSA, DES or any appropriate algorithm
  • a media player such as a computer with MP3 software player, contacts an Internet site to download media, such as a song in MP3 format
  • the player sends its unique identifier to the Internet site, where the identifier is modified using the original data and the result is encrypted
  • the modified and encrypted identifier is then embedded in the original data, and the combined data is downloaded to the player
  • The media player is able to extract the identifier from the combined data, and compare it to its own identifier. If these identifiers are identical and any additional information, such as a date limit, is verified, the player will play the data. If the combined data is copied to a second player having a different identifier, the second player will not play the combined data.
  • the encryption key also requires proper handling, and the identifier may include additional information besides the player identifier
  • An example of the second utilization includes, rather than a unique identifier, a predefined copy code such as "allow no copying," "allow copying one time, but not copying of a copy," and "allow unlimited copying"
  • the recorder would retrieve the copy code and not copy unless permitted by the code
  • the copy would either contain no "allow copying one time" code, or contain an "allow no copying" code
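The copy-code rules just described amount to a small decision table that a recorder could apply after retrieving the embedded code. The sketch below is an assumption-laden illustration: the `CopyCode` enumeration and both function names are hypothetical, and of the two options the text gives for a one-time copy, it picks the one where the copy carries an "allow no copying" code.

```python
from enum import Enum


class CopyCode(Enum):
    NO_COPYING = "allow no copying"
    COPY_ONCE = "allow copying one time, but not copying of a copy"
    UNLIMITED = "allow unlimited copying"


def may_copy(code):
    """A recorder copies only when the retrieved code permits it."""
    return code in (CopyCode.COPY_ONCE, CopyCode.UNLIMITED)


def code_for_copy(code):
    """Copy code to embed in the new copy (assumed policy for COPY_ONCE)."""
    if code is CopyCode.UNLIMITED:
        return CopyCode.UNLIMITED
    if code is CopyCode.COPY_ONCE:
        return CopyCode.NO_COPYING  # the copy itself may not be copied again
    raise PermissionError("copying is not permitted by the embedded code")


first_generation = code_for_copy(CopyCode.COPY_ONCE)
```

A second attempt to copy the first-generation copy then fails the `may_copy` check, which is the generational behavior the copy code is meant to enforce.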
  • both the player and the broadcast unit would know the code beforehand (i.e., predefined) or the code would be included in the broadcast
  • For the third example utilization, two approaches are described. In the first approach, a DVD player will not play the DVD without retrieving the predefined identifier embedded in the original data. For extra security, the identifier could be encrypted with a key located at a central database or in a section of the DVD not available for copy.
  • The identifier could control the number of generations of copies allowed, noting that if no identifier exists, no copies can be made. Or, there could be two layered identifiers for both types of copy management.
  • The fourth example utilization involves embedding secure data in the picture of a photo-card, as in a photo used for identification purposes like a driver's license or credit card. If the retrieved information at the photo-card reader does not match that of the central database, the card is recognized as a fake and will not be authorized for use. Note that the information and key exchange must be securely transmitted.
  • The fifth example utilization allows the secure transmission of secret information, hidden in the media. Most bystanders will not know the secret message is attached. If found, the hidden information cannot be read by, modified by, and/or transferred to other media by an imposter when the embedded data is dependent upon the media and encrypted. Different types of encryption, symmetric or public/private key, can be used for creating the desired protection or authentication of the embedded data. This hidden information enables a person or machine on the receiving side to perform an action.
  • the exemplary apparatus for these processes involves a logic processor, possibly including DSP chips, host CPUs or custom analog or digital circuitry, and memory
  • Media or content includes, but is not limited to, audio, video, still images, combinations of the above, and forms related to other senses
  • the terms media and content are used interchangeably
  • Media does not refer to a storage medium
  • a media or content segment includes, but is not limited to, a song, part of a song, movie, part of a movie, part or all of a sound track, part or all of a still image, a taste, a touch, and an odor
  • Original data is the raw, unprotected data
  • the auxiliary information refers to any data that is to be embedded in the original data
  • the ID 140 in Fig 21 refers to this auxiliary information, and may include but is not limited to, information such as the player ID, number of copies allowed, usage time or date limits, and content enhancement information such as author, copyright, publisher, song lyrics or image details
  • the embedded data is the data that is actually embedded in the original data
  • The embedded data differs from the auxiliary data by the transformation used in the embedding process. This transformation can include the
  • Fig 14 demonstrates an exemplary enabling process. This process uses a logic processor 900 and memory 910, as shown in Fig 22. First, as shown in box 10, processor 900 retrieves the auxiliary data from the combined data 5 and stores it in memory 910. Then, processor 900 determines whether the embedded data allows the desired action, as shown in box 20. If so, the desired action is allowed, as shown in box 30. If not, the desired action is disallowed, as shown in box 40.
  • Fig 15 demonstrates the registration process. This process involves assigning a unique registration code 305 to each recording device 300, and embedding the registration code 305 into the media when it is recorded, as shown in box 310. Then, when illegal media is found in an open-market
  • This recording device may be a physical device or virtual device
  • a physical device could include a CD or DVD burner
  • a virtual device could include a software program using processor 900 and memory 910 to digitally compress (bit rate reduce) audio, such as an MP3 ripper or AAC encoder
  • Fig 16 displays the way in which dynamic locking blocks the duplication of the auxiliary data. Duplication is blocked both for bit-for-bit copying of the embedded data between content, and for retrieving the embedded information and re-embedding it into different content such that the different content appears authentic.
  • Fig 18A displays an overview of the dynamic locking and embedding process
  • the whole process contains three steps, but either one (not both) of the first two steps, i.e., those steps of dynamic locking, can be skipped
  • the order of the last two steps can be switched
  • This switch is beneficial when the content, including the embedded data, is encrypted, usually for other content protection reasons, or when the modification step has some of the desirable features such as requiring a key to be unmodified
  • the auxiliary data (d) which is to be embedded, is modified based upon the original content (c)
  • This step is designed to modify the auxiliary data to be dependent upon the original content such that the embedded data cannot be copied bit-for-bit between content
  • the chosen content bits should be critical to the content, such that they cannot be changed in new content to make it appear authentic
  • a desirable function is the exclusive-or (XOR) operator since this function is its own inverse and efficiently implemented on digital processors
  • the modified data is encrypted such that the original auxiliary bits cannot be obtained from the embedded data
  • the original auxiliary bits cannot be re-embedded in different content, making this different content appear authentic
  • the auxiliary data is not modified by the original content before being encrypted, it could be copied bit-for-bit from the original content to new content making the new content appear authentic
  • the encrypted and modified (labeled dynamically locked) auxiliary data is embedded into the original content
  • Fig 18B displays an overview of the process used to retrieve and dynamically unlock the auxiliary data
  • the whole process contains three steps, and each step should only be performed if the corresponding step was performed when the data was embedded. In addition, if the order of the last two steps was switched while embedding, the two corresponding steps should be switched during this retrieval process
  • the embedded data is retrieved from the content. At this time, the embedded data consists of encrypted and modified auxiliary data (assuming both dynamic locking steps were performed)
  • the retrieved data is decrypted
  • the output of step two is unmodified. The result is the original auxiliary data
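  • the locking and unlocking steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the first bytes of the content stand in for the "critical" content bits, and the SHA-256 counter-mode keystream is only a placeholder for whatever cipher a real system would use. The embedding step itself is omitted

```python
import hashlib


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings (XOR is its own inverse)."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def content_bits(content: bytes, length: int) -> bytes:
    """Assumption: the first `length` bytes stand in for the critical content bits."""
    if len(content) < length:
        raise ValueError("content shorter than auxiliary data")
    return content[:length]


def keystream(key: bytes, length: int) -> bytes:
    """Placeholder cipher: SHA-256 in counter mode. Illustrative only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def dynamic_lock(aux: bytes, content: bytes, key: bytes) -> bytes:
    modified = xor_bytes(aux, content_bits(content, len(aux)))   # step 1: modify
    return xor_bytes(modified, keystream(key, len(modified)))    # step 2: encrypt


def dynamic_unlock(locked: bytes, content: bytes, key: bytes) -> bytes:
    modified = xor_bytes(locked, keystream(key, len(locked)))       # decrypt
    return xor_bytes(modified, content_bits(content, len(locked)))  # unmodify
```

  • because the modification depends on the content, locked data copied bit-for-bit into different content unlocks to something other than the original auxiliary data, even when the key is known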
  • Correlated data may include information such as song lyrics or the address of the person in a photographic identification card
  • Fig 19 shows several example implementations of the modification part of dynamic locking and unlocking when data is embedded such that it will not be perceived (i.e., watermarking)
  • the modified auxiliary information may be encrypted before being embedded and decrypted after being retrieved (but before being unmodified), if desired
  • the modification of the auxiliary information may be skipped, and the auxiliary information may be only encrypted before being embedded and decrypted after being retrieved
  • the cryptology process is not discussed in detail since someone familiar with the state of the art easily understands its implementation
  • Fig 19A shows dynamic locking and unlocking as applied to an apparatus earlier-described
  • the peak value, box 200, or threshold crossing value is used in the exclusive-or (XOR) calculation to modify the next N auxiliary information bits, where N is the number of bits per sample in the data (such as 16 bits for CD audio)
  • these modified N bits of the auxiliary information are optionally encrypted and embedded (e g , by the above-detailed methods) using locally masked bit manipulations of difference ⁇
  • This process is repeated for the next group of N peaks, and so on, until the whole modified auxiliary information is embedded or all the original data has been used with modified auxiliary information being repetitively embedded
  • the embedded data can be retrieved using the process described above, decrypted (if required), and unmodified
  • the unmodifying process is the inverse of the modifying process. Since the XOR function is its own inverse, the peak values of the combined data and the decrypted auxiliary information are applied to the XOR function. Importantly, the peak values are identical to those of the original data since they were not changed during the embedding process
  • N = 16 bits
  • the first 16 bits of the auxiliary information are modified by the first peak value using the XOR
  • these modified auxiliary information bits are optionally encrypted and embedded in the data points after the current peak and the next 15 peaks
  • This process is repeated for the following group of 16 peaks and auxiliary information bits, and so on, until all the data is embedded or all the original data has been used
  • the modified and optionally encrypted auxiliary information can be embedded over and over again within the data, by restarting the process with the first 16 bits of the auxiliary information after all of the bits have been embedded
  • the embedded data can be retrieved, decrypted (if encrypted), and unmodified with the inverse of the XOR calculation, which is an XOR calculation
  • the first original 16 bits of the auxiliary information can be obtained by performing the XOR calculation with the retrieved and decrypted embedded data and first peak value
  • the retrieving process is continued for the next group of 16 peaks of the combined and embedded data, and so on, until the whole auxiliary information is found or all of the combined data has been traversed
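  • the 16-bit example above can be sketched as follows. The peak definition (a simple local maximum) and the function names are assumptions made for illustration. The sketch covers only the XOR modification, not the encryption or the masked embedding, and it relies on the stated property that peak values are unchanged by embedding

```python
def find_peaks(samples):
    """Indices of local maxima (an assumed, simplified peak definition)."""
    return [i for i in range(1, len(samples) - 1)
            if samples[i - 1] < samples[i] >= samples[i + 1]]


def lock_words(aux_words, samples, group=16):
    """XOR each 16-bit auxiliary word with the first peak value of its group.

    Word k is modified by the value of peak number k * group, so each
    word is tied to its own group of `group` peaks.
    """
    peaks = find_peaks(samples)
    if len(peaks) <= (len(aux_words) - 1) * group:
        raise ValueError("not enough peaks for the auxiliary data")
    return [word ^ (samples[peaks[k * group]] & 0xFFFF)
            for k, word in enumerate(aux_words)]


def unlock_words(locked_words, samples, group=16):
    """The inverse of an XOR is the same XOR, so unlocking reuses lock_words."""
    return lock_words(locked_words, samples, group)
```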
  • a sync pulse could be embedded and used for re-alignment during the retrieval process
  • Fig 19B shows dynamic locking and unlocking as applied to Patent #5,774,452 "Apparatus and method for encoding and decoding information in audio signals" by Jack Wolosewicz of Aris Technologies, incorporated herein by reference
  • the data values occurring previously in time to the embedding of the pulse-width modulated (PWM) bit stream and shown in box 220 could be used in an XOR operation with auxiliary information to modify and unmodify the embedded data
  • Fig 19C shows an overview of applying dynamic locking and unlocking to embedded data schemes based upon pseudo-random noise (PN) sequences
  • the PN sequence could skip the Mth data point, as shown in boxes 250 and 270, where M is equal to the number of bits per sample in the data (N) times the length in bits of the PN sequence segment applied to each auxiliary information bit
  • the Mth data point would be used in an XOR operation with b bits of the auxiliary information to modify the auxiliary information
  • M 1024 bit segment of the PN sequence in 16 bit audio
  • the auxiliary information is 64 bits long
  • M = 1024-bit PN segment × 16-bit audio
  • This modified and optionally encrypted auxiliary information can be used to control the fashion in which the PN sequence is added to the original data, as is well known in the state of the art of spread spectrum technology. Specifically, in many applications, the PN sequence will be phase shifted by the modified auxiliary information (i.e., where 0 scales and adds the negative value of the PN sequence and 1 scales and adds the positive value) or simply multiplied by the auxiliary information. Once retrieved, the modified auxiliary information could be unmodified using the inverse XOR calculation with the skipped data point
  • Another embodiment for PN sequences is using the skipped data point to modify the next N bits of the PN sequence, not the auxiliary information. If one point is skipped, the number of PN bits modified, M, should be equal to N, the number of bits in the data. If two points are skipped, M is equal to 2*N, and so on. Modifying the PN sequence using an XOR calculation and optional encryption is one scheme. However, this may reduce the randomness of the PN sequence, and other modification functions can be employed to maintain randomness. Finally, the modified and optionally encrypted PN sequence is embedded in the media data and used to retrieve the embedded data
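  • the embodiment in which the skipped data point modifies the auxiliary bits can be sketched as follows. All names and parameters are assumptions. The PN segment is shortened from 1024 to 64 chips, Python's `random` module stands in for a real PN generator, encryption is omitted, and the gain is set unrealistically high (above the host amplitude) so that the toy correlation detector is deterministic rather than statistical

```python
import random

N = 16                    # bits per sample, as in the text
SEG_LEN = 64              # PN chips per auxiliary bit (the text's example uses 1024)
BLOCK = 1 + N * SEG_LEN   # one skipped sample followed by N spread bits
GAIN = 60                 # assumption: larger than any host sample magnitude


def pn_chips(seed, count):
    """Pseudo-random +1/-1 chips; sender and receiver share the seed."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(count)]


def embed(host, aux_bits, seed):
    out = list(host)
    pn = pn_chips(seed, len(host))
    for start in range(0, len(aux_bits), N):
        base = (start // N) * BLOCK
        skipped = host[base] & 0xFFFF           # untouched sample supplies N key bits
        for k, bit in enumerate(aux_bits[start:start + N]):
            locked = bit ^ ((skipped >> k) & 1)  # XOR modification
            sign = 1 if locked else -1           # 1 adds the PN segment, 0 subtracts it
            first = base + 1 + k * SEG_LEN
            for j in range(first, first + SEG_LEN):
                out[j] += sign * GAIN * pn[j]
    return out


def retrieve(marked, bit_count, seed):
    pn = pn_chips(seed, len(marked))
    bits = []
    for start in range(0, bit_count, N):
        base = (start // N) * BLOCK
        skipped = marked[base] & 0xFFFF
        for k in range(min(N, bit_count - start)):
            first = base + 1 + k * SEG_LEN
            corr = sum(marked[j] * pn[j] for j in range(first, first + SEG_LEN))
            locked = 1 if corr > 0 else 0
            bits.append(locked ^ ((skipped >> k) & 1))  # inverse XOR
    return bits
```

  • the host must hold at least one block of `BLOCK` samples for every N auxiliary bits; the skipped samples are never altered, which is what lets the receiver recompute the same XOR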
  • Fig 20 demonstrates applying dynamic locking and unlocking to data embedded in the header, not the content
  • Fig 20A displays the pseudo-code for the dynamic locking process
  • the auxiliary data bits of length L
  • the process of Fig 20A starts at the beginning of the content bits (box 700) and auxiliary data bits (box 705)
  • L auxiliary data bits are locked by being modified with L bits of the content using the XOR or applicable function, and/or encrypted (box 735)
  • L content bits should be critical to either or both the file format and the content, such that they cannot be replicated in a different media segment without disturbing it
  • M bits of locked auxiliary data are embedded in the frame header (box 710). M should be less than L.
  • L is divisible by M, such that the L bits are embedded in L/M frame headers. If L is not divisible by M, a person familiar with the state of the art can easily handle the offset. Then, the content is checked to see if more frames exist (box 715). If there are no content frames left, the process is completed (box 730). If there are more content frames, the auxiliary data is checked to see if any previously modified bits exist (box 720). If there are previously modified auxiliary bits left, the next frame is read (box 725), and the process is continued at box 710. If there are no previously modified auxiliary bits left, the next content frame is read (box 740), the auxiliary data is re-started at bit 0 (box 705), and the process is continued at box 710. This process assumes the auxiliary information is of length L and L is reasonably short for ease of explanation. If there is a very large number of auxiliary bits, they can be broken into segments of length L, and rather than starting at the first auxiliary bit each time, the process starts at the offset.
  • the auxiliary data bits are modified in each frame, specifically, between boxes 725 and 710 in Fig 20A
  • a PN sequence could be used to randomize which M bits of original audio are used
  • M must be large enough so that error correction in new content cannot repair all the content bits that need to be changed, such that a bit-for-bit transfer of the auxiliary data makes the new content appear authentic
  • the value of M depends upon the frame size and desired bit rate
  • compressed content such as MPEG data, specifically Layer III (MP3) or AAC audio as specified in the MPEG 1 and 2 specifications, ISO 11172-3 and ISO 13818-7 respectively, incorporated herein by reference
  • the frames and header bits are pre-defined
  • the private, copyright, or ancillary bits can thus be used to embed the data. When using content without predefined frames, such as
  • the locked auxiliary data could be placed only in the global header, defined as the header for the complete file, or in a linked but separate file. These two cases are less secure than embedding the data throughout the file. More bits mean the data will be more robust to attack via brute force. For broadcast content, the data should be embedded throughout the content as described above so the rendering device or person can receive the auxiliary information and respond accordingly from any point in the broadcast
  • Fig 20B displays the pseudo-code for the retrieval and dynamic unlocking process for the auxiliary data embedded in Fig 20A
  • the auxiliary data bits are retrieved by reading them from the header of the content frames and unlocking them, in a repetitive manner
  • the process of Fig 20B starts at the beginning of the content bits (box 750) and auxiliary data bits (box 755). Then, N content bits are saved, such as in memory 910 of Fig 22, so they can be used to unlock the next N retrieved auxiliary data bits (box 785)
  • M bits of locked auxiliary data are read from the frame header (box 760). Then, the content is checked for existing frames (box 765). If there are no content frames left, the process is completed (box 780). If there are content frames left, the auxiliary data bits are checked to see if any exist (box 770). If there are auxiliary bits left, the next frame is read (box 775), and the process is continued at box 760. If there are no auxiliary bits left, the retrieved auxiliary data is unlocked (box 790), the next frame is read (box 795), the auxiliary data is re-started at bit 0 (box 755), another N content bits are saved (box 785) and the process is continued at box 760. For this example, unlocking the retrieved
  • the same retrieved auxiliary bits and original content bits should be used in the inverse calculation as were used in the modifying calculation
  • the first M audio bits of the frame should be used to unmodify the modified auxiliary data, which was retrieved and decrypted
  • the auxiliary bits are retrieved accordingly. For example, if bits are embedded in the global header or linked file, they are read from the global header or linked file, respectively
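  • the header-based locking of Fig 20A and the retrieval of Fig 20B can be sketched together as follows. The frame representation (a list of payload byte strings, with header bits kept alongside) and the function names are hypothetical. The sketch assumes L is divisible by M, uses the first L payload bits of the frame that starts each pass as the content bits, and omits encryption

```python
M = 4  # locked bits carried per frame header (assumed value)


def bits_of(data: bytes, count: int):
    """First `count` bits of `data`, most significant bit first."""
    return [(data[i // 8] >> (7 - i % 8)) & 1 for i in range(count)]


def embed_in_headers(frames, aux_bits, m=M):
    """Return (header_bits, payload) pairs; frames is a list of payload bytes."""
    length = len(aux_bits)              # L in the text
    per_pass = length // m              # L/M frame headers hold one full copy
    tagged, locked = [], []
    for index, payload in enumerate(frames):
        slot = index % per_pass
        if slot == 0:
            # Start of a pass: lock the L auxiliary bits with L content bits.
            locked = [a ^ c for a, c in zip(aux_bits, bits_of(payload, length))]
        tagged.append((locked[slot * m:(slot + 1) * m], payload))
    return tagged


def retrieve_from_headers(tagged, length, m=M):
    """Return one unlocked copy of the auxiliary bits per complete pass."""
    per_pass = length // m
    copies = []
    for start in range(0, len(tagged) - per_pass + 1, per_pass):
        group = tagged[start:start + per_pass]
        locked = [bit for header, _ in group for bit in header]
        content = bits_of(group[0][1], length)   # same bits used when locking
        copies.append([x ^ c for x, c in zip(locked, content)])
    return copies
```

  • header bits transplanted onto frames with different payloads no longer unlock to the original auxiliary bits, which is the behaviour the locking step is meant to provide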
  • the process begins with a sending device 100, dynamically locking an ID 140 as shown in box 110, and embedding the locked ID within the media as shown in box 120
  • ID usually refers to an identifier, but can include any auxiliary information
  • the sending device 100 may be an encoder, recorder, transmitter, storage medium, or the like
  • the media is then transmitted to a receiving device 130 in which the locked ID is retrieved as shown in box 160, and dynamically unlocked as shown in box 170. Then, the proper action is enabled if allowed by the retrieved ID 140, as shown in box 180
  • the receiving device 130 may be a decoder, player, recorder, and/or the like
  • the encryption key must be located somewhere and transmitted safely, as shown in boxes 151, 152, and 153. Transmitting the key safely is well understood by one familiar with the state of the art in cryptology
  • the location of the key depends upon the requirements of the utilization. The five utilizations demonstrate various key locations. For most utilizations, the key will be available only in one of the three possible locations
  • the encryption and decryption key will usually be identical (symmetric), and referred to as the encryption key in the discussion below
  • public/private key encryption could also be used in many of these situations. When discussing private/public encryption below, the key will be specified as the public or private encryption key
  • certain utilizations may not need
  • ID 140 the types of sending devices 100 and receiving devices 130 are also explained in more detail in these example utilizations
  • the five example utilizations include distribution of MP3 data, copy-once access to broadcast data, DVD copy protection, photo-card verification, and secret data transmission. From these explanations, many more utilizations are obvious
  • the MP3 data exists on the Internet and is purchased by an end-user
  • the delivery system interacts with the end-user's player, securely transmitting ID 140 and the encryption key, shown in box 151, from the receiving device to the sending device, and dynamically locking, including encryption, the ID 140 in the MP3 data
  • the encryption key, shown in box 151, is located on the end-user's player
  • once the MP3 file is delivered (i.e., downloaded), only the end-user's player can play the data since other players will have different IDs
  • a portable and PC-based player may share the ID 140, and this is easily implemented by a software program and current digital electronics, such as EPROM or flash memory. Since the ID 140 is dynamically locked, the end-user cannot extract the ID 140 and use it in another song or MP3 file
  • the MP3 encoder and player may be part of one software program, which transforms CD, DVD or broadcast audio into MP3 audio with the embedded data containing the dynamically locked, including encryption, ID 140
  • the software applications should be programmed such that the key and ID 140 are protected from the end-user, as well known in the state of the art in software
  • the key shown in box 151
  • the transformed MP3 audio is now only playable on the end-user's system and/or portable player, and it is not possible to move the ID 140 to another song as described above
  • the key could be located in a central database, as shown in box 152
  • This configuration allows a different key for each player and MP3 audio sample. This configuration increases robustness to attack since new keys are used for each song, but involves extra management tools and responsibilities
  • the ID 140 could contain time limits for listening to the audio or a date limit after which the audio will not play. The player will keep track of how many times the song has been played or whether the date has expired
  • the ID 140 could contain a demo code, which does not limit the song to one player
  • for copy-once access (defined as allowing an end-user to copy the media only once, perhaps for time shifting purposes), the concept is explained in terms of the broadcast of a movie. With broadcast media, it is best if everyone shares the same encryption key
  • the key as shown in box 153, could be broadcast embedded in the movie, and changed for each broadcast
  • the copy-once ID 140 will be predefined, meaning that it is already defined in the transmitting and receiving device, as shown in Fig 21 where ID 140 has an optional location in the transmitting device
  • the recorder can record the movie and either remove the copy-once ID 140 or change the ID 140 to a predefined code that informs other recorders that the media has been copied once
  • the player will not play the media unless the embedded ID enables the action
  • the encryption key, as shown in box 153, is included on the DVD in a non-copy access location. This means the user will be able to play the media only when the DVD disk is present since the player will not play the DVD data without retrieving the correct ID
  • a copy of the entire DVD (minus the encryption key since it is unable to be copied) or a copy of a content file will be unusable since the key to decrypt the embedded data will not be found and without it the player will not work
  • the key could be located in a centrally accessible database, and possibly linked to the requesting end-user player, as shown in box 152. This configuration increases robustness to attack since access to the key is monitored, but includes extra management responsibilities for the content provider and additional time for the end-user
  • the key could also be purchased and encrypted by the key in your player as described in U S patent 5,933,498 to Paul Schneck (incorporated herein by reference)
  • the ID 140 will be predefined, and exist in the sending device 100
  • the predefined ID 140 could be used to enable the recorder, and allow a certain number of copy generations, or a copy of only the original, known as serial copy management. ID 140 could be modified to allow one less recording generation each time the DVD is recorded, possibly by keeping track of the recorded generation and originally allowed count or by reducing the allowed count
  • for serial copy management, the watermark could be removed in the second generation DVD. Remember that in this approach if the watermark does not exist, no copies can be made. Finally, there could be two-layered ID 140s for both types
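  • the generation-count variant described above (reducing the allowed count each time the media is recorded) can be sketched as follows. The `CopyID` payload and the `record` function are hypothetical names, and the sketch ignores locking and embedding entirely

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CopyID:
    """Hypothetical payload for ID 140: further recording generations allowed."""
    generations_left: int


def record(embedded: Optional[CopyID]) -> Optional[CopyID]:
    """Return the ID to embed in the new copy, or None if recording is refused.

    A missing ID refuses recording, matching the rule that no copies
    can be made when the watermark does not exist.
    """
    if embedded is None or embedded.generations_left <= 0:
        return None
    return CopyID(embedded.generations_left - 1)
```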
  • the photo-card utilization example involves having the picture in the photo-card embedded with the ID 140. If the correct information is not present, the card is a fake and will not be authorized for use
  • the ID 140 is reversibly modified by the photograph or connected data such as the corresponding name and address, and encrypted, such that the information cannot be copied between cards or from a legitimate card to an illegal card
  • the matching ID 140 and encryption key can be stored at a database only accessible by every sending device (i.e., in the sending device) and securely transmitted between the database and the photo-card reading device, such as using RSA key exchange or any other method known in the state of the art of cryptology. Besides being as secure as other cryptology techniques, another advantage of this process is that it requires transmission of minimal data, including the short ID 140 and encryption key
  • the receiving device extracts the hidden message
  • the receiving device, a connected device, or a human will be enabled by the hidden information contained in ID 140
  • the hidden information can be protected from being moved to other media segments and/or interpreted by using dynamic locking with various encryption schemes. For example, if the secret information is encrypted with your public key, only you can recover it. Or if it is encrypted with your private key, people or devices receiving the message using your public key know it was signed by you and is authentic. If it is encrypted with a symmetric key, only the holders of the key could have created and read the message. Finally, if the modification step of dynamic locking is used, the receiver knows the message was not transferred from a different media segment
  • Fig 22 shows an exemplary apparatus used to implement the enabling, registration, and dynamic locking processes
  • the hardware includes a logic processor 900 and memory 910
  • the logic processor 900 may be defined as the equivalent of a digital signal processor (DSP), general-purpose central processing unit (CPU), or a specialized ASIC chip
  • a likely DSP chip is one of the Texas Instruments TMS320 product line
  • a CPU could include one of Intel's Pentium line or Motorola/IBM's PowerPC product line
  • the design is straightforward for someone familiar with the state of the art given the description of these processes
  • the memory 910 includes any type of memory
  • Fig 23A shows more detail of the apparatus for dynamic locking. Specifically, the logic processor 900 and memory 910 must work together to act as the modifier 1010 and encrypter 1040. Modifier 1010 performs the modification step of dynamic locking. Encrypter 1040 performs the encryption step of dynamic locking
  • Fig 23B shows more detail of the apparatus for dynamic unlocking
  • the logic processor 900 and memory 910 must work together to act as the decrypter 1045 and the unmodifier 1015
  • the decrypter 1045 performs the decryption step of dynamic unlocking
  • the unmodifier 1015 performs the unmodifying step of dynamic unlocking
  • the unmodifier 1015 and decrypter 1045 of dynamic unlocking may use the same or different circuitry as the modifier 1010 and encrypter 1040 of dynamic locking. However, when using the same circuitry, the dynamic locking and unlocking processes would use different control programs
  • Binding and ID Assignment: as noted, another aspect of the technology detailed herein relates to media-binding, e.g., the fashion in which consumers legitimately access protected content while controlling piracy
  • a basic concept is that the content contains an ID that locks it to a particular user or broadcast and the rendering device automatically determines whether the content can be accessed based upon the current and previously rendered IDs and rules
  • Such technology may result in increased sales of content for the content providers
  • One aspect of the technology resides in having the rendering device keep track of the IDs contained in both the current and previously accessed content. This allows the rendering device to control access to new content based upon the new content's ID, the rules provided with the content (by the content providers) and/or within the device, and the IDs from content previously rendered by the device
  • the ID may be linked to the user or the broadcast. User IDs work well for content that is sold for a user's continued use, whereas broadcast IDs work well for content recorded by the user from a broadcast
  • the rendering device includes constraints that limit the number of content tracks with different user IDs that can be accessed in a certain amount of time, possibly influenced by the number of times content with each user ID has already been accessed
  • broadcast IDs and the optionally included rules can be used to limit rendering or copying of each broadcast
  • the limits are based upon date or number of times that ID is played, not on the total number of broadcast IDs
  • a portable MP3 player can keep track of each song's user ID, and if the previously played songs contain more than N different user IDs, the player decides if it can replace an old user ID with the new one due to the old user ID's date and number of times songs with that ID have been played
  • the MP3 player notes that the user has played the audio X times and Y times is allowed by the broadcast, or the date is past the broadcast's allowable usage date
  • the rendering device is a device that can play, view, record or perform a similar action upon the data
  • the rendering device can provide any type of perceived data, including but not limited to images, audio and video. If the rendering device has a portable section, such as with an MP3 player, the loader, which puts the content onto the rendering device, is considered as part of the rendering device
  • the ID may be a user or broadcast ID
  • many MP3 players can also record broadcasts, and these broadcasts will, in the future, contain embedded broadcast IDs, possibly as watermarks or header data with digital broadcasts
  • Content refers to the desired audio, video, image, or other relevant perceived data
  • Content providers include but are not limited to record labels, movie studios, and independent artists
  • the ID may be embedded within the content such as bits in the header file or a watermark, or the ID can be linked to the encryption and decryption of the content
  • this automatic ID management may be used in conjunction with other methods, such as media-binding
  • Fig 25 displays an overview of an automatic ID management process
  • the rendering device 100 keeps track of the IDs contained within the content it has previously accessed (box 110)
  • the rules 120 may be provided in the device hardware and/or contained with the content
  • the rules 120 decide whether or not the device can access the new content based upon its ID (box 130). If the rendering device has a portable section, such as with an MP3 player, the loader, defined above as part of the rendering device, can be used to lower the amount of memory required within the portable section, thus lowering its costs
  • the portable section may contain all of the memory and processing hardware (described in detail below) required to perform this automatic ID handling, or the hardware may be split between the loader and portable section
  • the loader may store all the information about IDs on the computer and all the rendering device needs to do is count the number of times each song is played and maintain date information for its current list of content
  • the rules 120 include constraints 245, which are contained within the content as specified by the content provider, as well as default rules contained with the rendering device hardware
  • the constraints 245 are retrieved from the content 200 (box 240)
  • the constraints 245 may limit the number of content tracks with different IDs that a device can access during a set time-period
  • the constraints 245 may also change the time-period an ID is stored dependent upon the number of times content with a specific ID was accessed
  • the constraints 245 may be embedded within the content or attached as header information or a linked file
  • the hardware includes a logic processor 300 and a memory 310
  • the logic processor 300 may be defined as the equivalent of a digital signal processor (DSP), general-purpose central processing unit (CPU), or a specialized CPU, including media processors
  • a likely DSP chip is one of the Texas Instruments TMS320 product line
  • a CPU could include one of Intel's Pentium line or Motorola/IBM's PowerPC product line
  • the design of code for controlling logic processor 300 is simple for someone familiar with the state of the art given the above pseudo-code and description
  • a person familiar with the state of the art could implement the logic processor 300 using analog and digital circuitry, either separate or in an application specific integrated circuit (ASIC)
  • the memory 310 stores the information required by rules 120, such as IDs, last play date, and the number of times that content with each ID has been accessed
  • Memory 310 may consist of standard computer random access memory (RAM). It is also desirable if memory 310 maintains this information even without power in the rendering device, perhaps but not limited to using ROM with backup and chargeable battery power, or memory that is stable without power, such as EPROM. As discussed above, memory 310 may consist of two separate modules when using a portable section and loader
  • an ID 210 is retrieved from the content 200.
  • the ID 210 is checked to see if it is a user or broadcast ID (box 215). For user IDs, the following happens. If the ID 210 already exists in the memory 310 of device
  • the play count and last access date are updated (box 222) and the content 200 is rendered (box 230)
  • the rules 120 are checked. If another ID can exist in memory 310 (box 250), ID 210 and the current date are added to the memory 310 (box 260) and the content is rendered (box 230). If another ID cannot be added, the rules 120 are checked to see if any existing IDs can be replaced because they are too old (box 270). If any IDs can be replaced, the old ID is replaced with ID 210 (box 280) and the content is rendered (box 230). If no IDs can be replaced, the user may be warned and access to content 200 is denied or limited (box 290). The user may also be presented with a link to buy the content (box 290)
  • the rules may allow a device to store 10 IDs, and IDs can be replaced if they have not been accessed for a week
  • the number of times an ID has been rendered could be used to determine whether or not to replace the old ID with a new one (box 270)
  • This count value could influence the time period an ID is held in memory 310, thus allowing ID 210 to replace a stored ID (boxes 270 and 280). For example, if content associated with the stored ID has not been accessed in a week, it can be replaced. Conversely, if content associated with the stored ID has been played at least 7 times, it should be held for at least a month since its last access
  • the count for an ID can be reduced by one each day and incremented by one for each rendering of content containing the ID, and the ID can be replaced (box 270) if the count is zero or less, or the date of last access is over a week
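  • the user-ID flow and the example rule above (10 stored IDs, replaceable after a week without access) can be sketched as follows. The `IdStore` class and its method names are hypothetical, the limits are hard-coded rather than read from the content's constraints, and the count-based variations are left out

```python
from datetime import date, timedelta

MAX_IDS = 10                       # example rule: the device stores 10 IDs
IDLE_LIMIT = timedelta(days=7)     # example rule: replaceable after a week


class IdStore:
    """Tracks user IDs seen by a rendering device (stand-in for memory 310)."""

    def __init__(self):
        self.entries = {}          # user_id -> [play_count, last_access]

    def request(self, user_id, today):
        """Return True if content carrying user_id may be rendered today."""
        if user_id in self.entries:                      # known ID (box 222)
            entry = self.entries[user_id]
            entry[0] += 1
            entry[1] = today
            return True
        if len(self.entries) < MAX_IDS:                  # room left (boxes 250, 260)
            self.entries[user_id] = [1, today]
            return True
        stale = next((old for old, (_, last) in self.entries.items()
                      if today - last > IDLE_LIMIT), None)   # box 270
        if stale is None:
            return False                                 # deny or limit (box 290)
        del self.entries[stale]                          # replace old ID (box 280)
        self.entries[user_id] = [1, today]
        return True
```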
  • the ID 210 is examined to see if it already exists in memory 310 (box 255). If not, the ID 210 and current date are added to the rendering device's memory
  • the content is rendered (box 230)
  • the play count, record date and/or last access date are checked to see if the content can be rendered (box 275)
  • the broadcast may allow only two renders, or one week of rendering, or rendering until a specific date. If the broadcast is allowed to be rendered, the count and last access date are updated (box 285) and the content is accessed (box 230). If the broadcast is not allowed to be rendered, the user is notified, the access is limited and a link to buy the broadcast or similar content may be provided, if applicable (box 295)
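  • the broadcast-ID check can be sketched as follows. The `BroadcastRule` and `BroadcastRecord` types are hypothetical. Each limit is optional so that a rule can express a render count, a rendering window, a fixed expiry date, or any combination of these

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class BroadcastRule:
    max_renders: Optional[int] = None     # e.g. only two renders
    window: Optional[timedelta] = None    # e.g. one week after recording
    expires: Optional[date] = None        # e.g. rendering until a specific date


@dataclass
class BroadcastRecord:
    record_date: date
    play_count: int = 0


def try_render(rec: BroadcastRecord, rule: BroadcastRule, today: date) -> bool:
    """Apply the rule (box 275); update the count only when rendering is allowed."""
    if rule.max_renders is not None and rec.play_count >= rule.max_renders:
        return False
    if rule.window is not None and today - rec.record_date > rule.window:
        return False
    if rule.expires is not None and today > rule.expires:
        return False
    rec.play_count += 1                   # box 285
    return True
```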
  • the reset function may require a password code that is pseudo-random, thus requiring the user to contact support to reset the device. For example, the password may depend upon the day and year and be obtained from an automation system
  • the reset button may also delete all the current content as well as ID information. This allows people to use one portable player with many friends at a party, but the loss of content will discourage piracy since it will be cumbersome
  • Figure 28 shows a portable MP3 player 400 that contains the described apparatus implementing the described pseudo-code
  • the logic processor 300 could be a separate processor, or share access with the processor that decompresses the audio
  • the device also contains the necessary memory
  • the device may share this memory with a software loader
  • the logic processor 300 could be a separate processor or share time with the processor handling content for the device, such as compressing or decompressing digital content
  • one watermark is robust and declares that the media is protected. This watermark is embedded when the media is encoded into the desired format, such as MP3. This means that the intensity of adding the watermark is not an issue because the watermark is only added to the audio once, and copied with the audio by the distributor
  • the other watermark declares that it is okay to play or record the media. It is efficient, and does not need to be difficult to remove, since removing it produces no advantageous results
  • the efficiency of this watermark is desirable since it must be embedded each time the audio is reproduced, such as downloaded on the Internet, to link the media to the user, player, recorder and/or storage device. Thus, it greatly reduces the cost of copy management for the distributor. In addition, it lowers the cost of the players, since usually they only have to find this efficient watermark. Only when it does not exist does the player need to determine if the audio is protected with the robust but computationally intense watermark
  • non-protected media may contain neither watermark and can be played by any device from any storage
  • Fig 29 displays a process employing two watermarks
  • Media 100 exists in an insecure format, meaning that devices can play the media 100 even if it does not contain any copy protection and/or authentication watermarks. It is a format in which some artists wish to freely distribute their content, such as MP3
  • Watermark 110 declares that the media is protected. Watermark 110 must be extremely difficult to remove, and is allowed to be computationally intense
  • Many existing watermark methods meet this description, and future ones will certainly be designed
  • Watermark 120 links the media to the user, player, recorder and/or storage device. This link determines if the user may copy and/or play the media. Watermark 120 must be a computationally efficient method that is hard to imitate
  • Watermark 110 is embedded when the audio is encoded, and copied with the audio when distributed
  • the computational intensity of adding the watermark is not that important. Watermark 120 is embedded when the media is reproduced, such as being distributed, placed on permanent storage, or encoded from an alternative form by a personal encoding device
  • reproduced refers to the legal transformation or distribution of the media
  • copying refers to an individual producing an exact bit-for-bit replication of the media for legal or illegal utilization. Since watermark 120 is embedded every time the media is reproduced, its efficiency creates a reduction in cost. Since watermark 120 is embedded after watermark 110, it must be okay to layer the watermarks, as known to be possible with existing technology
  • the watermarks are searched for and retrieved in a specific order, as shown in Figs 29 and 31
  • the media is searched for watermark 120 (box 300). If watermark 120 is retrieved (box 310), the embedded information is evaluated (box 320). If the embedded information is correct, the desired action is enabled (box 330). Alternatively, if the embedded information is not correct, the desired action is disabled (box 340). Only if watermark 120 is not found does the media need to be searched for the computationally intense watermark 110 (box 350). If watermark 110 declares the media protected, then the desired action is disabled (box 340). If watermark 110 is not present (or declares the media to be free), the desired action is allowed (box 330)
  • the just-detailed process can be used to restrict copying and/or playing of the media
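The search order described above (boxes 300 to 350) can be sketched as follows. This is only an illustration of the control flow: the three detector callbacks are hypothetical placeholders, since the text does not specify how either watermark is embedded or read.

```python
from typing import Callable, Optional


def action_allowed(
    media: bytes,
    find_wm120: Callable[[bytes], Optional[bytes]],
    wm120_is_correct: Callable[[bytes], bool],
    wm110_declares_protected: Callable[[bytes], bool],
) -> bool:
    """Decide whether copying or playing is enabled (box 330) or disabled (box 340)."""
    # Box 300: always try the computationally efficient watermark 120 first.
    payload = find_wm120(media)
    if payload is not None:  # box 310: watermark 120 was retrieved
        # Box 320: the link to the user, player, recorder or storage decides.
        return wm120_is_correct(payload)
    # Box 350: only now run the computationally intense search for watermark 110.
    # Media with neither watermark is unprotected and may be used freely.
    return not wm110_declares_protected(media)
```

The expensive detector is reached only on the path where watermark 120 is absent, which is where the cost saving for players comes from.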
  • Fig 32 shows hardware apparatus that may be used to implement the invented processes
  • the hardware includes a logic processor 400 and a storage unit 410
  • the logic processor 400 may be defined as the equivalent of a digital signal processor (DSP), general-purpose central processing unit (CPU), or a specialized CPU, including media processors
  • a likely DSP chip is one of the Texas Instruments TMS320 product line
  • a CPU could include one of Intel's Pentium line or Motorola/IBM's PowerPC product line
  • the design is simple for someone familiar with the state of the art given the above pseudocode and description
  • the storage unit 410 includes RAM when using a digital processor
  • a person familiar with the state of the art could alternatively implement the process with analog and digital circuitry, either separate or in an application specific integrated circuit (ASIC)
  • the analog and digital circuitry could include any combination of the following devices: a digital-to-analog converter (D/A), comparators, sample-and-hold circuits, delay elements, analog-to-digital converter (A/D), and programmable logic controllers (PLC). Programmable logic devices (PLDs) can likewise be used
  • the detection criteria may include the relationship between several points, or be as simple as a threshold crossing, or include every Mth point
  • the adjustment of the neighboring points may be as simple as multiplying the point after the threshold crossing by N. It is advantageous if N is less than one but not equal to zero, so that saturation and data points equal to zero are not a problem, and if the threshold is positive and the data is decreasing towards zero during the threshold crossing
  • the process can include searching through the data for the detection criteria and then readjusting neighboring points to their original value. For example, if the adjustment in the degradation process uses multiplication by N, the recovery process multiplies by 1/N
  • digital content refers to digital data representing a perceived physical item, including but not limited to audio, video, and images
  • Digital data refers to the grouping of bits (1's or 0's) that represent a sample of the original digital content at an instant in time. Each bit group is equivalently referred to as a data point or sample
  • the data points are arranged in an order, many times representing a sequence versus time or frequency
  • the data points may be grouped again to form a subgroup, possibly used to represent a sequence versus frequency versus time, as is the case in MPEG standard compressed digital audio and video
  • the digital data has an order, with a beginning and end, such that searching the data is possible, and neighboring points can be defined as points close to each other
  • point(s) refer to one or several points
  • Fig 36 displays an overview of the degradation and recovery process
  • Fig 37 displays the corresponding pseudocode to be implemented by the apparatus
  • the samples are searched for the detection criteria (boxes 200, 210 and 220)
  • the searching stops after the last data point in the buffer has been examined (box 210), and a new buffer may be presented if available
  • data values must be saved between buffers and properly initialized for the first buffer so that the initial points are properly searched
  • the neighboring data point(s) are adjusted so as to cause content degradation (box 230)
  • the adjustment of these points should either leave the location of the detection criteria unchanged or change it in a known fashion; otherwise, detecting the correct location at which to readjust the data to its original value (recovery) is not easy
  • the degraded data is searched for the detection criteria defined by the degradation process (boxes 200, 210 and 220). If the degradation process has changed the detection criteria in a known fashion, then the detection criteria in box 220 for recovery differ from those used in degradation. When the criteria location is found, the neighboring data point(s) are re-adjusted by the inverse of the method used in the degradation process (box 230)
  • the detection criterion is a threshold crossing (using C notation, x[n-1] > thr && x[n] < thr) with a positive threshold (thr > 0) while the data goes towards zero (boxes 400, 410 and 420)
  • the neighboring point(s) include only the point after the threshold crossing (box 430)
  • the adjustment involves multiplying the data point after the threshold crossing (x[n]) by N, where N is less than 1 (box 430)
  • N is less than 1
  • although the detection criteria here do not change between degrading and recovering the original digital data, this is not a requirement
  • the detection criteria may change, if in a known fashion, such that the recovery process uses a different (but known) detection criteria than the degradation process
  • box 420 or 220, as discussed above
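The threshold-crossing example (boxes 400 to 430) can be illustrated with a short sketch. It assumes floating-point samples held in a single buffer, so the carrying of values between buffers mentioned earlier is omitted; the function names are hypothetical. Because the adjusted point stays below the threshold, the crossing is found at the same location during recovery.

```python
from typing import List


def _scale_after_crossings(x: List[float], thr: float, factor: float) -> List[float]:
    """Multiply each point that follows a downward threshold crossing by factor."""
    out = list(x)
    for n in range(1, len(out)):
        # Detection criterion from the text: x[n-1] > thr && x[n] < thr.
        if out[n - 1] > thr and out[n] < thr:
            out[n] *= factor  # box 430: adjust only the point after the crossing
    return out


def degrade(x: List[float], thr: float, n_factor: float) -> List[float]:
    """Degrade content by scaling the point after each crossing by N."""
    if thr <= 0 or not 0 < n_factor < 1:
        raise ValueError("expected thr > 0 and 0 < N < 1")
    return _scale_after_crossings(x, thr, n_factor)


def recover(x: List[float], thr: float, n_factor: float) -> List[float]:
    """Undo degrade() by scaling the same points by 1/N."""
    if thr <= 0 or not 0 < n_factor < 1:
        raise ValueError("expected thr > 0 and 0 < N < 1")
    return _scale_after_crossings(x, thr, 1.0 / n_factor)
```

With integer samples, multiplying by N and then by 1/N would generally round, so exact recovery in that case needs a choice of N and sample format that avoids losing bits.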
  • the original content need not be represented by digital samples versus time, as one may have assumed
  • the digital samples represent subgroups of frequencies versus time
  • the data may be searched across frequency for each subgroup, or across time for each frequency, or in any other but well defined combination
  • the data may also represent either the frequency magnitude or corresponding scaling factors
  • Fig 40 shows illustrative hardware used to implement the described degradation and recovery processes
  • the hardware includes a logic processor 500 and a storage unit 510
  • the logic processor 500 may be defined as the equivalent of a digital signal processor (DSP), general-purpose central processing unit (CPU), or a specialized CPU, including but not limited to media processors
  • a likely DSP chip is one of the Texas Instruments TMS320 product line
  • a CPU could include one of Intel's Pentium line or Motorola/IBM's PowerPC product line
  • the design of code for controlling logic processor 500 is simple for someone familiar with the state of the art given the above pseudo-code and description
  • the storage unit 510 includes RAM when using a digital processor, and is required to store the current buffer and/or previous point(s) for the detection criteria
  • analog and digital circuitry could include any combination of the following devices: digital-to-analog converters (D/A), comparators, sample-and-hold circuits, delay elements, analog-to-digital converters (A/D), and programmable logic controllers (PLC)
  • a process is provided that avoids scrambling the header or other important information about the content
  • An advantage to leaving the header alone is that applications or devices can quickly read information about the content before de-scrambling and accessing the content
  • the header may contain copyright information that the player is required to check before playing
  • the scrambling process scrambles some or all of the non-header content. If only some of the non-header content is scrambled, it must be more than error correction, if present, can repair
  • MPEG (Moving Picture Experts Group)
  • the header of each frame is avoided while scrambling some or all of the non-header content
  • the de-scrambling process recovers the original content from the scrambled content, similarly avoiding the header information
  • An exemplary process involves using a pseudo-random noise (PN) sequence and the XOR function to scramble the content while avoiding the headers of each frame
  • the header of a file contains important information about the file. This information may include the type of file, author, place of origin, date of origin, last modified date, file size, structure allocations, copyright codes, unique IDs, usage rules, etc
  • the header may exist only at the beginning of the file, at the beginning of frames within the file, or both. Frames are common for compressed digital media, such as MPEG audio and video. More specifically, with MP3 data, the header may include what the MPEG standard labels as header, error correction and side information. In addition, if the content does not contain frames or a header, such data can easily be created in a new structured file format
  • Fig 33a displays an overview of the scrambling process. If the file has only a global header or frames of known size without synchronization (sync) codes, the headers are located and skipped (box 105) during the scrambling step (box 110). In other words, there is no reason to look for the sync code (box 100)
  • the scrambling step may scramble part or all of the non-header content. If the file is broken into frames with additional sync codes, the sync codes that define the frames are found (box 100), the header information is skipped (box 105) and the content is scrambled (box 110). Usually, the header contains information about the frame's size, which aids in locating the next sync code, as the sync code may also randomly occur within the data. Once again, the scrambling step may scramble part or all of the non-header content
  • the scrambling step can consist of methods used in the prior art. Standard modern encryption, such as DES or RSA, is an excellent choice. With this encryption, although one may be able to de-scramble one file by brute-force, another file can remain secure even when using the same key
  • Other scrambling options may include simple mathematical operations with a PN sequence, such as multiplication, addition, subtraction, or exclusive-or (XOR). Division should be used carefully since it may cause bit errors due to the imprecise nature of limited bit-length division
  • Fig 33b displays an overview of the de-scrambling process
  • De-scrambling is the inverse of scrambling, and only the content bits that were scrambled should be de-scrambled
  • the file has only a global header or frames of known size without synchronization (sync) codes
  • the headers are located and skipped (box 155) during the de-scrambling process (box 160)
  • sync code box 150
  • the header information is skipped (box 155) and part or all of the remaining content in that frame is de-scrambled
  • the header contains information about the frame's size, which aids in locating the sync code, as the sync code may not be unique and, thus, occur within the data
  • the de-scrambling step may de-scramble part or all of the non-header content, depending upon what was scrambled
  • the de-scrambler should use the inverse of the function used by the scrambler
  • the de-scrambler requires a decryption key, and the key may be different than the encryption key
  • the key is expected to remain the same for many frames and most likely for the whole content track, where a track can consist of a song or movie
  • the key will probably change for each track, and there are many ways to send keys for someone familiar with the state of the art in cryptology
  • the key for the PN sequence is the generator function, and it does not change for each MP3 song, i.e. defined as a
  • Fig 34a displays the pseudo-code for an example of the scrambling or de-scrambling process
  • the content contains frames that begin with a sync code, and a header exists for each frame. Since the inverse of the XOR function is itself, the pseudo-code for the scrambling and de-scrambling process is identical
  • Content scrambled or de-scrambled by this simple example could include MPEG audio data, such as Layer III (MP3) or AAC. MPEG audio's sync code is '111111111111' (twelve 1 bits)
  • the process begins at the beginning of the content (box 200). Then, the sync code is found, usually being the first few bits of the content (box 205). Next, the header data is skipped, possibly reading its own size from data after the sync code (box 210). Then, the M content bits for that frame are scrambled using an XOR operation with the M content bits and M bits of a PN sequence (box 215). Fig 34b shows the input and output for the XOR function. Next, the content is checked to see if another frame exists (box 220). If another frame exists, the process continues at box 205 where the next sync code is located. Usually, the size of the frame can be read from the frame's header, which aids in searching for the next sync code. If no content remains, the process is complete (box 225). In this example, the size of M determines the robustness to brute-force attack, where the attacker's purpose is to obtain the original content. The larger M, the more robust the scrambled content is to attack. However, the smaller M, the more efficient the
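The loop in boxes 200 to 225 can be sketched as below. Several details are assumptions made for illustration and are not taken from the MPEG standard or from Fig 34a: a toy frame layout (two sync bytes `0xFF 0xF0` followed by one payload-length byte), byte-level rather than bit-level processing, a 16-bit LFSR as the PN generator with a hypothetical default seed, and a PN sequence that restarts at every frame.

```python
from typing import Iterator

SYNC = b"\xff\xf0"  # hypothetical: twelve 1 bits followed by four 0 bits
HEADER_LEN = 3      # hypothetical: 2 sync bytes + 1 payload-length byte


def pn_bytes(seed: int) -> Iterator[int]:
    """Yield pseudo-random bytes from a 16-bit Fibonacci LFSR (taps 16, 14, 13, 11)."""
    state = seed & 0xFFFF
    if state == 0:
        raise ValueError("LFSR seed must be non-zero")
    while True:
        byte = 0
        for _ in range(8):
            bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
            state = (state >> 1) | (bit << 15)
            byte = (byte << 1) | bit
        yield byte


def xor_scramble(content: bytes, seed: int = 0xACE1) -> bytes:
    """Scramble or de-scramble frame payloads, leaving every header untouched."""
    out = bytearray(content)
    pos = 0                                    # box 200: start of the content
    while True:
        pos = content.find(SYNC, pos)          # box 205: locate the sync code
        if pos < 0 or pos + HEADER_LEN > len(content):
            break                              # box 225: no further frame
        size = content[pos + 2]                # payload size read from the header
        start = pos + HEADER_LEN               # box 210: skip the header
        end = min(start + size, len(content))
        pn = pn_bytes(seed)                    # assumption: PN restarts per frame
        for i in range(start, end):            # box 215: XOR payload with PN bytes
            out[i] = content[i] ^ next(pn)
        pos = end                              # box 220: continue after this frame
    return bytes(out)
```

Since XOR is its own inverse and the headers are unchanged, calling `xor_scramble` on scrambled content with the same seed returns the original. Jumping ahead by the size in the header also keeps a sync pattern that happens to occur inside a payload from being mistaken for a frame start.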
  • Fig 35 shows hardware suitable for use in implementing the scrambling or de-scrambling process
  • the hardware includes a digital logic processor 300 and a digital memory 310
  • the logic processor performs the calculations and logic for this process
  • the logic processor 300 may be defined as the equivalent of a digital signal processor (DSP), general-purpose central processing unit (CPU), a specialized CPU, including media processors, or application specific circuitry (ASIC)
  • a likely DSP chip is one of the Texas Instruments TMS320 product line
  • a CPU could include one of Intel's Pentium line or Motorola/IBM's PowerPC product line
  • the ASIC can easily be designed by someone familiar with the state of the art and the above pseudo-code and description
  • the design of code for controlling logic processor 300 is also simple for someone familiar with the state of the art given the above pseudo-code and description
  • the memory 310 includes RAM when using a digital processor, and is used to store the
  • a payload of N bits may be encoded as M bits, where M>N (i.e. with partial or complete redundancy)
  • the redundancy can include repetition of the N-bit payload through the content; BCH, convolutional, turbo, etc. coding of the N bits to provide robustness and/or error correction; CRC or ECC codes; etc
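As a small illustration of the simplest option listed above, the sketch below repeats an N-bit payload to produce M = N × repeats bits and recovers it by per-bit majority vote. The function names are hypothetical, and the stronger codes mentioned (BCH, convolutional, turbo) are not shown.

```python
from typing import List


def encode_repetition(payload: List[int], repeats: int) -> List[int]:
    """Repeat the whole N-bit payload, giving M = N * repeats bits."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return payload * repeats


def decode_repetition(coded: List[int], n_bits: int) -> List[int]:
    """Recover the payload by majority vote over the copies of each bit."""
    if n_bits <= 0 or len(coded) % n_bits != 0:
        raise ValueError("coded length must be a positive multiple of n_bits")
    repeats = len(coded) // n_bits
    decoded = []
    for i in range(n_bits):
        ones = sum(coded[i + k * n_bits] for k in range(repeats))
        decoded.append(1 if 2 * ones > repeats else 0)
    return decoded
```

An odd number of repeats avoids ties in the vote; with an even number, this sketch resolves a tie to 0.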
EP00916232A 1999-03-10 2000-03-10 Signal processing methods, devices, and applications for digital rights management Withdrawn EP1157499A4 (de)

Applications Claiming Priority (13)

Application Number Priority Date Filing Date Title
US12358199P 1999-03-10 1999-03-10
US12358799P 1999-03-10 1999-03-10
US123581P 1999-03-10
US123587P 1999-03-10
US12659199P 1999-03-26 1999-03-26
US12659299P 1999-03-26 1999-03-26
US126592P 1999-03-26
US126591P 1999-03-26
US404292 1999-09-23
US404291 1999-09-23
US09/404,291 US7055034B1 (en) 1998-09-25 1999-09-23 Method and apparatus for robust embedded data
US09/404,292 US7197156B1 (en) 1998-09-25 1999-09-23 Method and apparatus for embedding auxiliary information within original data
PCT/US2000/006296 WO2000054453A1 (en) 1999-03-10 2000-03-10 Signal processing methods, devices, and applications for digital rights management

Publications (2)

Publication Number Publication Date
EP1157499A1 (de) 2001-11-28
EP1157499A4 EP1157499A4 (de) 2003-07-09

Family

ID=27558005

Family Applications (1)

Application Number Title Priority Date Filing Date
EP00916232A Withdrawn EP1157499A4 (de) 1999-03-10 2000-03-10 Signal processing methods, devices, and applications for digital rights management

Country Status (6)

Country Link
EP (1) EP1157499A4 (de)
JP (1) JP2002539487A (de)
KR (1) KR100746018B1 (de)
AU (1) AU3736800A (de)
CA (1) CA2364433C (de)
WO (1) WO2000054453A1 (de)

Families Citing this family (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7373513B2 (en) 1998-09-25 2008-05-13 Digimarc Corporation Transmarking of multimedia signals
US8091025B2 (en) 2000-03-24 2012-01-03 Digimarc Corporation Systems and methods for processing content objects
KR100548983B1 (ko) * 2000-11-02 2006-02-02 (주)마크텍 디지털 증명서의 발급 및 인증을 위한 텍스트의 삽입 방법및 장치
US7124442B2 (en) 2001-07-25 2006-10-17 440 Pammel, Inc. System and method for insertion and retrieval of microthreads in transmitted data
US20030069853A1 (en) * 2001-10-04 2003-04-10 Eastman Kodak Company Method and system for managing, accessing and paying for the use of copyrighted electronic media
JP3867642B2 (ja) 2002-08-28 2007-01-10 ヤマハ株式会社 楽音再生用デジタルデータの情報処理装置、情報処理方法、プログラム及び記憶媒体
KR100965437B1 (ko) * 2003-06-05 2010-06-24 인터트러스트 테크놀로지즈 코포레이션 P2p 서비스 편성을 위한 상호운용 시스템 및 방법
US7646881B2 (en) * 2003-09-29 2010-01-12 Alcatel-Lucent Usa Inc. Watermarking scheme for digital video
TW200638335A (en) * 2005-04-13 2006-11-01 Dolby Lab Licensing Corp Audio metadata verification
WO2007035817A2 (en) 2005-09-20 2007-03-29 Celodata, Inc. A method, system and program product for the insertion and retrieval of identifying artifacts in transmitted lossy and lossless data
US8566858B2 (en) 2005-09-20 2013-10-22 Forefront Assets Limited Liability Company Method, system and program product for broadcast error protection of content elements utilizing digital artifacts
US8966517B2 (en) 2005-09-20 2015-02-24 Forefront Assets Limited Liability Company Method, system and program product for broadcast operations utilizing internet protocol and digital artifacts
US8566857B2 (en) 2005-09-20 2013-10-22 Forefront Assets Limited Liability Company Method, system and program product for broadcast advertising and other broadcast content performance verification utilizing digital artifacts
US10269086B2 (en) 2008-10-09 2019-04-23 Nagra France Sas Method and system for secure sharing of recorded copies of a multicast audiovisual program using scrambling and watermarking techniques
AR077680A1 (es) 2009-08-07 2011-09-14 Dolby Int Ab Autenticacion de flujos de datos
US8407808B2 (en) * 2010-05-27 2013-03-26 Media Rights Technologies, Inc. Security thread for protecting media content
TWI800092B (zh) * 2010-12-03 2023-04-21 美商杜比實驗室特許公司 音頻解碼裝置、音頻解碼方法及音頻編碼方法
US20140204994A1 (en) * 2013-01-24 2014-07-24 Silicon Image, Inc. Auxiliary data encoding in video data

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5721788A (en) * 1992-07-31 1998-02-24 Corbis Corporation Method and system for digital image signatures

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5646997A (en) * 1994-12-14 1997-07-08 Barton; James M. Method and apparatus for embedding authentication information within digital data
US5943422A (en) * 1996-08-12 1999-08-24 Intertrust Technologies Corp. Steganographic techniques for securely delivering electronic digital rights management control information over insecure communication channels
EP0766468B1 (de) * 1995-09-28 2006-05-03 Nec Corporation Verfahren und Vorrichtung zum Einfügen eines Spreizspektrumwasserzeichens in Multimediadaten
US5859920A (en) * 1995-11-30 1999-01-12 Eastman Kodak Company Method for embedding digital information in an image
DE69739969D1 (de) * 1996-06-20 2010-10-07 Ibm Verfahren zum verstecken von daten
US6061793A (en) * 1996-08-30 2000-05-09 Regents Of The University Of Minnesota Method and apparatus for embedding data, including watermarks, in human perceptible sounds
TW312770B (en) * 1996-10-15 1997-08-11 Japen Ibm Kk The hiding and taking out method of data
WO1998016927A1 (fr) * 1996-10-16 1998-04-23 International Business Machines Corporation Procede d'enregistrement de donnees de support sur un support d'enregistrement et procede et systeme permettant d'acceder aux donnees de support enregistrees sur ledit support
CA2265647C (en) * 1996-10-16 2003-09-23 International Business Machines Corporation Method and system for managing access to data through data transformation
JP3281561B2 (ja) * 1996-12-25 2002-05-13 シャープ株式会社 モータ速度制御装置
US5875249A (en) * 1997-01-08 1999-02-23 International Business Machines Corporation Invisible image watermark for image verification
US6141753A (en) * 1998-02-10 2000-10-31 Fraunhofer Gesellschaft Secure distribution of digital representations
US6021196A (en) * 1998-05-26 2000-02-01 The Regents University Of California Reference palette embedding
JP2003505895A (ja) * 1998-09-10 2003-02-12 マークエニー・インコーポレイテッド ウェーブレット及び離散コサイン変換を用いたディジタルイメージのウォーターマーキング方法

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HARTUNG F ET AL: "SPREAD SPECTRUM WATERMARKING: MALICIOUS ATTACKS AND COUNTERATTACKS" , PROCEEDINGS OF THE SPIE, SPIE, BELLINGHAM, VA, US, VOL. 3657, PAGE(S) 147-158 XP000949145 * page 147 * * figure 1 * *
See also references of WO0054453A1 *

Also Published As

Publication number Publication date
JP2002539487A (ja) 2002-11-19
EP1157499A4 (de) 2003-07-09
KR100746018B1 (ko) 2007-08-06
CA2364433A1 (en) 2000-09-14
WO2000054453A9 (en) 2002-07-04
WO2000054453A1 (en) 2000-09-14
CA2364433C (en) 2011-07-19
KR20020022131A (ko) 2002-03-25
AU3736800A (en) 2000-09-28

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

17P Request for examination filed

Effective date: 20010808

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE

AX Request for extension of the european patent

Free format text: AL;LT;LV;MK;RO;SI

A4 Supplementary search report drawn up and despatched

Effective date: 20030528

RIC1 Information provided on ipc code assigned before grant

Ipc: 7G 11B 20/00 B

Ipc: 7H 04N 7/24 A

Ipc: 7H 04N 1/32 B

Ipc: 7H 04N 7/26 B

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN WITHDRAWN

18W Application withdrawn

Effective date: 20030808